Attorney Docket: 225/50731

PATENT 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 



Applicant: CARSTEN KNOEPPEL ET AL 

Serial No.: NOT YET ASSIGNED PCT NO.: PCT/EP00/05337

Filed: DECEMBER 11, 2001 

Title: METHOD OF DETECTING OBJECTS WITHIN A WIDE 

RANGE OF A ROAD VEHICLE 



PRELIMINARY AMENDMENT 

Box PCT 

Commissioner for Patents 
Washington, D.C. 20231 

Sir: 

Please enter the following amendments to the specification and claims, as 
amended by way of Annexes to the International Preliminary Examination 
Report for PCT/EP00/05337, prior to the examination of the application during
the U.S. National Phase. 



IN THE SPECIFICATION:

Submitted herewith is a substitute specification and marked-up copy 

thereof which includes the changes made by way of the Annexes to the 
International Preliminary Examination Report. 
IN THE CLAIMS:

Please cancel claims 1-9 presently in the application and substitute new
claims 10-21 as follows:

--10. (new) A method of detecting objects in a vicinity of a road vehicle
up to a considerable distance, in which a distance from a moving or stationary 
vehicle to one or more objects is calculated by distance-based image 
segmentation using stereo image processing, and characteristics of the detected 
objects are determined by object recognition in the segmented image regions, the 
method comprising the acts of: 

determining image regions of elevated objects and/or flat objects; 

detecting elevated objects and/or flat objects by combining 3D points 
in accordance with predetermined criteria, the elevated objects being determined 
through features with similar distance values and the flat objects being 
determined through features with similar height values; 

tracking over time relevant detected objects and determining the 
distance and lateral position of the relevant detected objects relative to the road 
vehicle in order to assess dynamic behavior of the relevant detected objects; 

determining object hypothesis for performing object recognition, 
said object hypothesis being verified by comparison with object models; 

scanning segmented image regions in accordance with 
predetermined, statistically verified 2D features of particular relevant detected 
objects to be recognized; and 
comparing the particular relevant detected objects using a neural
network for classifying a specific object type. 

11. (new) The method according to claim 10, wherein elevated relevant
detected objects are road vehicles and flat relevant detected objects are road 
markings and boundaries. 

12. (new) The method according to claim 10, further comprising the act of
determining a relative position and a relative speed of the relevant detected
objects relative to one another and to the road vehicle by evaluating a distance
measurement, in order to determine an accurate road-lane object association.

13. (new) The method according to claim 12, wherein the relative position
and the relative speed of the relevant detected objects are determined in order to 
assess a relevance of the detected objects to a particular situation. 

14. (new) The method according to claim 11, further comprising the act of
determining a relative position and a relative speed of the relevant detected 
objects relative to one another and to the road vehicle by evaluating a distance 
measurement, in order to determine an accurate road-lane object association. 
15. (new) The method according to claim 14, wherein the relative position
and the relative speed of the relevant detected objects are determined in order to 
assess a relevance of the detected objects to a particular situation. 

16. (new) The method according to claim 10, further comprising the acts
of:

scanning one of recorded pairs of stereo images for significant 
features of objects to be registered; 

determining a spacing of the significant features by comparing 
respective features in a stereo image from a pair of stereo images with the same, 
corresponding features, in the other stereo image from the pair of stereo images 
recorded at the same time; and 

wherein disparities that occur are evaluated via cross correlation
techniques.

17. (new) The method according to claim 11, further comprising the acts
of:

scanning one of recorded pairs of stereo images for significant 
features of objects to be registered; 

determining a spacing of the significant features by comparing 
respective features in a stereo image from a pair of stereo images with the same, 
corresponding features, in the other stereo image from the pair of stereo images
recorded at the same time; and 

wherein disparities that occur are evaluated via cross correlation
techniques.

18. (new) The method according to claim 12, further comprising the acts
of:

scanning one of recorded pairs of stereo images for significant 
features of objects to be registered; 

determining a spacing of the significant features by comparing 
respective features in a stereo image from a pair of stereo images with the same, 
corresponding features, in the other stereo image from the pair of stereo images 
recorded at the same time; and 

wherein disparities that occur are evaluated via cross correlation 
techniques. 

19. (new) The method according to claim 16, wherein by determining the
spacing of the significant features in a pixel range, 3D points in the road vehicle 
environment are determined relative to a coordinate system of a measuring 
device performing the detecting method. 
20. (new) The method according to claim 10, wherein said objects are
detected by at least one of radar, infrared sensing, and stereoscopic or mono 
sensing. 




21. (new) A method of detecting and recognizing an object in a vicinity of
a road vehicle, the method comprising the acts of:

performing distance-based image segmentation to calculate a 
distance from the road vehicle to an object to be detected; 

scanning the segmented image regions in accordance with 
predetermined, statistically verified 2D features of the object to be detected; and 

comparing the detected object using a neural network for classifying 
it as a specific object type.--

IN THE ABSTRACT:

Please add an Abstract of the Disclosure submitted herewith on a separate
page.

(Applicants' remarks are set forth herein below starting on the 
following page). 
REMARKS



Entry of the amendments to the specification and claims, as amended by 
way of Annexes to the International Preliminary Examination Report for 
PCT/EP00/05337, before examination of the application in the U.S. National
Phase is respectfully requested. 

If there are any questions regarding this Preliminary Amendment or this 
application in general, a telephone call to the undersigned would be appreciated 
since this should expedite the prosecution of the application for all concerned. 

If necessary to effect a timely response, this paper should be considered as 
a petition for an Extension of Time sufficient to effect a timely response, and 
please charge any deficiency in fees or credit any overpayments to Deposit 
Account No. 05-1323 (Docket #225/50731). 



CROWELL & MORING, LLP 
P.O. Box 14300 
Washington, DC 20044-4300 
Telephone No.: (202) 624-2500 
Facsimile No.: (202) 628-8844 

JDS:pct 



Respectfully submitted,



December 11, 2001 




Jeffrey D. Sanok
Registration No. 32,169 
--ABSTRACT OF THE DISCLOSURE

The invention relates to a method of detecting objects within a wide range 
of a road vehicle. According to said method, the distance between a moving or 
stationary vehicle and one or more objects is calculated by distance-based image 
segmentation using stereoscopic image processing techniques and the properties 
of the detected objects are determined by object recognition in the segmented 
image areas. Image areas of three-dimensional and/or flat objects are detected 
and said three-dimensional and/or flat objects are detected by clustering 3D 
pixels according to defined criteria. Three-dimensional objects are determined 
by features with similar distance values and flat objects by features with similar 
height values.--
Method of detecting objects in the vicinity of a road vehicle
up to a considerable distance

BACKGROUND AND SUMMARY OF THE INVENTION 

[0001] The invention relates to a method of detecting 

objects in the vicinity of a road vehicle up to a considerable 
distance, in which the distance from a moving or stationary 
vehicle to one or more objects is calculated by distance-based 
image segmentation using stereo image processing, and 
characteristics of the detected objects are determined by 
object recognition in the segmented image regions. Image 
regions of elevated objects and/or flat objects are determined 
and the elevated objects and/or flat objects are detected by 
combining (clustering) 3D points in accordance with 
predetermined criteria. The elevated objects are determined 
through features with similar distance values and the flat 
objects are determined through features with similar height 
values. The relevant objects are followed over time 

(tracking) and their distance and lateral position relative to 
the particular vehicle is determined in order to assess the 
dynamic behavior of the relevant objects. 

[0002] In order to assist the driver of a motor vehicle in 

road traffic, driver assistance systems have been developed, 
which are suitable for detecting situations in the road 
traffic which are anticipated to be hazardous. Such driver 
assistance systems can either warn the driver, on the basis of 
his behavior, or intervene in the management of the vehicle. 
The intention here is to increase driving safety, to relieve 
the driver of monotonous driving tasks and, therefore, for 
driving to become more convenient.

[0003] On account of the high requirements on the 
reliability of systems which increase safety, at the current
time, it is predominantly convenience systems which are 
available on the market. Examples of this are parking aids and 
intelligent cruise control systems. Driver assistance systems 
which increase safety are intended to register the surrounding 
traffic situation to an ever increasing extent and to take it 
into account.

[0004] EP 0 558 027 B1 discloses a device for registering

the distance between vehicles. In the case of this device, a 
pair of image sensors generates an image of an object, which 
is displayed to the driver. One region of this image is 
subdivided into windows. The distances from the driving 
vehicle to the object, which is located in the respective 
window, are registered. In this case, the distances are 
calculated by comparing two items of image information 
recorded by different image sensors in different windows. On 
the basis of the determined distance information, the 
respective object is determined. A grid which divides the 
relevant image region is used. The grid surrounds the object 
to be registered and supplies further image information. The 
symmetry of this image information is determined, and the 
existence of a vehicle travelling in front is predicted by 
determining a level of stability of a horizontal movement of a 
line of symmetry and a second level of stability of the 
distances over time. 

[0005] This known registration device is used for the 

purpose of registering and recognizing vehicles located in 
front of the moving vehicle. The reliable recognition of 
objects is achieved only in the near region, however, since 
there the simple registration of lines of symmetry can be 
carried out with sufficient stability. In the remote region,
this simple registration of symmetry is no longer adequate on
its own because of the low resolution in the image and the 
resulting inaccuracy in the determination of the object. 

[0006] However, high requirements have to be placed on 

reliable object recognition in particular, in order that the 
driver is not given any erroneous information, which can lead 
to erroneous and hazardous reactions. In the case of 
intelligent systems, the vehicle itself could react in a 
manner presenting a traffic hazard, on the basis of the 
erroneous information. Reliable information is imperative, for 
example in lane-accurate recognition of vehicles at a
considerable distance, both in and counter to the actual 
direction of travel. 

[0007] For the recognition of patterns of interest,
DE 42 11 171 A1 proposes a method which applies the cross
correlation of small singular extracts from the entire pattern of
interest by means of block-by-block progressive image
recognition via a trained classification network.

[0008] DE 43 08 776 C2 discloses a device for monitoring 

the outer space around a vehicle, which is travelling over one 
lane on a road. The lane is defined by extended white lines. 
By means of image processing, the course of the road is 
determined by using three-dimensional position information 
from sections of the white lines. By utilizing the three- 
dimensional position information from the white lines, the 
white lines are separated from three-dimensional objects. For 
each section, the vertical extent of possible objects is 
determined. As a result, the coordinates for three-dimensional 
objects of interest, such as motor vehicles, motor cycles or 
pedestrians, can be defined in the coordinate system of the 
vehicle. In addition, it is possible to determine which object
is concerned. 

[0009] The procedure described in DE 43 08 776 C2 for 

monitoring the outer space around a vehicle requires a great 
deal of computation. It is always necessary to determine the 
course of the registered region of the road, in order to be 
able to determine the position of objects in this road course. 
Since only a limited amount of computing power is available in 
a motor vehicle, such a monitoring device is ill-suited to 
practical use. In addition, the known monitoring device
always depends on the presence of white boundary lines, which
may not be found on the course of all roads.

[0010] EP-A-0 874 331 discloses the practice of dividing up 

a distance image into regions in the lateral direction away
from the vehicle. In this case, a histogram relating to the 
distance values in the individual regions is drawn up, in 
order to determine the distances of individual objects from 
these histograms. The possibility of a collision or contact 
with objects or other vehicles on the roadway is determined 
from the position and size of the objects or vehicles. The 
relative speed of the objects in relation to the particular 
vehicle is determined by tracking the objects. A reliable 
statement relating to the relevance of the objects to the 
situation is possible only after a very computationally 
intensive procedure, which calls a practical application in 
road vehicles into question. 

[0011] The object of the invention is to specify a method 

of detecting objects in the vicinity of a road vehicle up to a 
considerable distance which permits the reliable registration 
of objects, in particular of vehicles in front of and/or 
behind the road vehicle and their relevance to the situation
on the basis of their position relative to the road vehicle.

[0012] According to the invention, this object is achieved 
by determining for the purpose of object recognition, object 
hypotheses, which are verified by comparison with object 
models. Segmented image regions are scanned in accordance 
with predetermined, statistically verified 2D features of the 
objects to be recognized. The detected objects are compared 
by using a neural network for the classification of a specific 
object type. The subclaims relate to advantageous developments 
of the subject of the invention. 

[0013] Accordingly, a method of detecting objects in the 

vicinity of a road vehicle up to a considerable distance is 
provided, in which the distance from a moving or stationary 
vehicle to one or more objects is calculated by distance-based 
image segmentation by means of stereo image processing, and 
characteristics of the detected objects are determined by 
object recognition in the segmented image regions. 

[0014] Determining the characteristics of the detected 

objects is intended to serve to clarify their relevance to the 
particular vehicle and therefore contribute to the 
understanding of the situation. 

[0015] The detection can preferably be carried out to the 

front or to the rear and employed, for example, to warn of 
traffic jams, for distance control from the vehicle in front 
or for monitoring the rear space. In this case, an important 
point of view is that the relevance to the situation or the 
potential hazard of the detected objects is determined from 
their distance to the particular vehicle and the determined 
relative speed.

[0016] Instead of evaluating pairs of stereo images, which 

are recorded by a stereo arrangement comprising optical 
sensors or cameras, in principle, even individually recorded 
images of different origin can be evaluated in order to 
determine the distance.

[0017] Image regions of elevated objects and/or flat 

objects are determined. Elevated objects and/or flat objects 
are detected by combining 3D points in accordance with 
predetermined criteria. Combining is also designated 
clustering. In this case, the elevated objects are determined 
through features with similar distance values and flat objects 
are determined through features with similar height values. By 
means of this procedure, objects can be recognized and 
assessed not only reliably with regard to their distance but 
also with regard to specific features. Distinguishing between 
elevated and flat objects is therefore easily possible.

[0018] Features of similar distance values and/or similar 

height are combined in order to form clusters. This 
distinction between elevated and flat objects is very 
important for reliable object recognition, for example the 
recognition of other motor vehicles, and the distinction from 
road markings. Since appropriately high computing powers can 
be implemented nowadays in modern motor vehicles, image 
segmentation of this type by means of distance determination 
and clustering can be carried out reliably and quickly. 

[0019] The relevant objects are followed over time and 

their distance and lateral position relative to the particular 
vehicle are determined, in order to assess the dynamic 
behavior of the relevant objects. Only with knowledge of the
dynamic behavior of the determined objects are practical 
reactions of the driver or of the vehicle possible. An 
"anticipatory" mode of driving is therefore promoted. 

[0020] Furthermore, by means of this tracking, as it is
known, phantom objects which occur sporadically can be
suppressed, and the entire recognition performance can be
increased. In this way, the number of extracted image regions
to be classified in the image can be reduced, if these are
checked for their local consistency by means of simple time
tracking. By means of tracking the detected objects over time,
the object characteristics, such as the distance, relative
speed and relative acceleration, can be freed of measurement
noise, for example by using a Kalman filter.

[0021] For the purpose of object recognition, object 

hypotheses are determined, which are verified by comparison 
with object models. 

[0022] In this way, for the purpose of object recognition, 

the segmented image regions may be scanned in accordance with 
predetermined, statistically verified 2D features of the 
objects to be recognized, and the detected objects may be 
compared by using a neural network for the classification of a 
specific object type. In this way, reliable object recognition
is carried out. 

[0023] The detected elevated objects may be, in particular, 

road vehicles, signposts, bridge columns, lamp posts and so 
on, whereas the detected flat objects may be, in particular, 
road markings and boundaries such as curb stones, crash 
barriers and so on. In this way, for example, the position of 
a road vehicle on a specific road lane can be determined in a
simple way. 

[0024] In addition, it is advantageous to know the relative 

position and the relative speed of the detected objects 
relative to one another and to the moving vehicle, in order to 
determine the relevance of the detected objects to the 
situation. To this end, the distance measurement is evaluated, 
and an accurate road-lane object association is determined.

[0025] During the image segmentation, one of the recorded 

pairs of stereo images can be scanned for significant features 
of objects to be registered. The spacing of the significant 
features may then be determined by means of cross correlation by
comparing the respective features in a stereo image from the 
pair of stereo images with the same, corresponding features in 
the other stereo image from the pair of stereo images, 
recorded at the same time. The disparities which occur are 
evaluated. 

[0026] By determining the spacing of significant features 

in the pixel range, 3D points in the real world are determined 
relative to the coordinate system of the measuring device. The 
information obtained in this way from 3D points is therefore 
determined from different objects, such as vehicles, road 
markings, crash barriers, and so on. 

[0027] In addition to the above-described stereo-based 

approach, in principle object registration methods based on 
radar and/or infrared signals in the remote range are also 
possible.
BRIEF DESCRIPTION OF THE DRAWINGS

[0028] Further advantages, features and details of the
invention become clearer by using the following description in
conjunction with the appended drawings, in which:

[0029] Fig. 1 shows a schematic representation of the
method steps according to the invention;

[0030] Fig. 2 shows a schematic representation to clarify

the principle of the distance determination in the case of 
cameras with the same focal length arranged in parallel; 

[0031] Fig. 3 shows a schematic representation to clarify 

the principle of the correspondence search by means of cross 
correlation; 



[0032] Fig. 4 shows a schematic representation to clarify 

the principle of the 2D feature extraction in the case of 
evaluation by a neural network according to the invention; 

[0033] Fig. 5 shows a schematic representation to clarify 

the principle of coordinate normalization; and 

[0034] Fig. 6 shows a representation of a distance profile 
of an approaching vehicle. 

DETAILED DESCRIPTION OF THE DRAWINGS 

[0035] In the following text, the image segmentation 1 by 

means of stereo image processing is described, during which 
elevated objects 2 are detected. This is carried out through 
clustering 3 individual features with similar distances. Then, 
a vehicle recognition method 5, 6 will be presented, with 
which road vehicles in the segmented image regions are 
recognized. For this purpose, features typical of vehicles are
extracted 6 and then compared with the internal vehicle model 
depiction 5 from a neural network 8. The basic procedure is 
shown schematically in Figure 1. 

[0036] Mono image processing is in principle also possible, 

given the use of similar means and a similar procedure. 

[0037] The characteristic that road vehicles are elevated 
by comparison with the road is used for the method of image 
segmentation presented here. To this end, use is made of a 
stereo camera system, with which it is possible to determine 
the distances of significant features which occur in the 
camera image on road vehicles. By means of this information, a 
statement about elevated objects 4 is possible. The 
continually increasing computing power, which is available in 
the vehicle nowadays, permits real-time analysis of pairs of 
stereo images.

[0038] It is also possible to determine reliably on which 

lane a registered road vehicle is located. It is then possible 
to make a statement about the relevance of this registered 
road vehicle to the situation, on the basis of its position 
relative to the particular vehicle. The driver and/or the 
particular vehicle can then react accordingly. 

[0039] Although radar systems suitable for vehicles do not

offer adequate lateral resolution for lane association, 
infrared systems have resolution and range problems and 
ultrasound can generally be used for the near range, it is in 
principle conceivable to employ these systems instead of or in 
combination with stereo camera systems. 
[0040] The principle of distance determination in the case
of the parallel camera arrangement used is represented in
Figure 2 on the basis of the pinhole camera model. The point P
in the world (camera's field of view) is projected onto the
sensor surfaces of each camera via the projection centers. $u_0$
and $u_1$ represent the deviation from the projection center.
Their difference

\[ \Delta u = u_0 - u_1 \]

is designated the disparity $\Delta u$. By means of trigonometry and
the sizes of the camera arrangement (focal length $f$ and base
width $b$), the distance $d$ can be calculated:

\[ d = \frac{f \cdot b}{\Delta u} \]

Here, $b$ represents the base width, $f$ the focal length and $d$
the distance to the point P. $u_0$ and $u_1$ are the distances of the
projections of the point P onto the sensor surface.
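The relation between disparity and distance described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `distance_from_disparity` and the example values (focal length in pixels, base width in meters) are hypothetical and do not come from the specification.

```python
def distance_from_disparity(u0, u1, focal_length, base_width):
    """Distance to a point seen by two parallel pinhole cameras.

    u0, u1: deviations of the two projections from the projection
            centers, in the same unit as focal_length (e.g. pixels).
    base_width: spacing of the two cameras (e.g. meters).
    The result has the unit of base_width.
    """
    disparity = u0 - u1
    if disparity <= 0:
        # Assumption: only points in front of the cameras are of interest,
        # and these always produce a positive disparity.
        raise ValueError("disparity must be positive")
    return focal_length * base_width / disparity


# Hypothetical numbers: f = 800 px, b = 0.3 m, disparity of 2.4 px.
example_distance = distance_from_disparity(12.4, 10.0, 800.0, 0.3)
```

With these assumed values the point lies at roughly 100 m, which shows how small the disparities become in the remote range.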

[0041] In the first processing step in the image 

segmentation, a search for significant features is carried out 
in one of the pairs of stereo images. A corresponding display 

(not shown) on a monitor or another display device may be 
provided only for research purposes. Significant features are 
supplied, for example, by edges, which occur reliably in the 
case of road vehicles. The locations of the selected edges, 
which define the image region to be correlated in the second 
processing step may be marked, for example, by means of 
rectangular frames in the monitor display. 



[0042] In order to determine the spacing of the features 
displayed on the monitor, the respective disparities are
determined by comparison with the second stereo image recorded 
at the same time. To this end, a search is made in each 
rectangular image region by means of cross correlation in the 
corresponding image. Figure 3 shows a schematic representation 
to clarify the principle of the correspondence search by means 
of cross correlation 11. 

[0043] On account of the parallel alignment of the cameras, 

the search region in the vertical direction may be restricted 
to the epipolar line, which is the respective image row in the case shown in
Figure 3. In the horizontal direction, the corresponding 
search region is defined in the corresponding image 9, 10 in 
accordance with permissible disparities. 

[0044] By means of using KKFMF (the local, average-free, 

normalized cross correlation function) as the correlation 
function, lightness differences in the pairs of images 9, 10, 
which occur for example as a result of different solar 
radiation or different control of the cameras, have only a 
slight effect on the correlation value. 

[0045] The correlation coefficient from the KKFMF is
calculated as follows:

\[ \mathrm{KKFMF}(x, y) = \frac{\sum_{j}\sum_{i} \overline{F}(i,j)\,\overline{P_r}(x+i,\,y+j)}{\sqrt{\sum_{j}\sum_{i} \overline{F}(i,j)^{2}\;\sum_{j}\sum_{i} \overline{P_r}(x+i,\,y+j)^{2}}} \]

[0046] The values $\overline{F}(i,j)$ and $\overline{P_r}(x+i, y+j)$ represent the
average-free grey values from the rectangular image regions
$F(i,j)$ and $P_r(x+i, y+j)$. Because of the normalization, the
results from the KKFMF move within the interval [-1, 1]. The
value 1 represents agreement in pairs, -1 represents
correspondingly inverse agreement.
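The correspondence search can be illustrated with a small sketch. It assumes images stored as lists of rows of grey values, a search restricted to the same image row, and a feature that appears shifted to the left in the second image. The names `mean_free`, `kkfmf`, `crop` and `find_disparity`, and the treatment of flat patches, are hypothetical choices made for this example and are not taken from the specification.

```python
import math


def mean_free(patch):
    """Flatten a rectangular patch and subtract its mean grey value."""
    values = [v for row in patch for v in row]
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def kkfmf(patch_f, patch_p):
    """Average-free, normalized cross correlation of two equal-size patches."""
    f = mean_free(patch_f)
    p = mean_free(patch_p)
    numerator = sum(a * b for a, b in zip(f, p))
    denominator = math.sqrt(sum(a * a for a in f) * sum(b * b for b in p))
    if denominator == 0.0:
        return 0.0  # assumption: a patch without structure gives no match
    return numerator / denominator


def crop(image, x, y, width, height):
    """Cut a width x height region with top-left corner (x, y)."""
    return [row[x:x + width] for row in image[y:y + height]]


def find_disparity(first, second, x, y, width, height, max_disparity):
    """Search along the same rows of `second` for the best match."""
    reference = crop(first, x, y, width, height)
    best_disparity, best_score = 0, -1.0
    for disparity in range(max_disparity + 1):
        if x - disparity < 0:
            break  # candidate region would leave the image
        candidate = crop(second, x - disparity, y, width, height)
        score = kkfmf(reference, candidate)
        if score > best_score:
            best_disparity, best_score = disparity, score
    return best_disparity, best_score
```

Because the mean is removed and the result is normalized, adding a constant brightness offset to one of the patches leaves the correlation value unchanged, which is the property the text attributes to the KKFMF.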

[0047] In the last processing step in the image 

segmentation, combining (cluster formation) of features with 
similar distance values takes place (cf. Figure 1). The 
relative height of the clusters formed is compared with a 
fixed minimum height, in order to ensure an elevated object 2. 
In this case, elevated objects are determined through features 
with similar distance values, and flat objects are determined 
through features with similar height values. 
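The cluster formation for elevated objects can be sketched as follows. This is a deliberately simplified illustration: it groups 3D points only by their distance value and then applies the minimum-height check, whereas a practical system would presumably also take the lateral position into account. The function name `cluster_by_distance`, the point layout `(lateral, height, distance)` and the thresholds `max_gap` and `min_height` are assumptions made for the example.

```python
def cluster_by_distance(points, max_gap, min_height):
    """Group 3D points with similar distance and keep elevated clusters.

    points: iterable of (lateral, height, distance) tuples in meters.
    max_gap: largest distance step allowed inside one cluster.
    min_height: minimum vertical extent for a cluster to count as elevated.
    """
    ordered = sorted(points, key=lambda p: p[2])
    clusters, current = [], []
    for point in ordered:
        # Start a new cluster when the distance jumps by more than max_gap.
        if current and point[2] - current[-1][2] > max_gap:
            clusters.append(current)
            current = []
        current.append(point)
    if current:
        clusters.append(current)

    elevated = []
    for cluster in clusters:
        heights = [p[1] for p in cluster]
        # Compare the relative height of the cluster with the fixed minimum.
        if max(heights) - min(heights) >= min_height:
            elevated.append(cluster)
    return elevated
```

In this sketch, points on a road marking lie at many different distances with almost no vertical extent, so they fail the height check, while the rear of a vehicle produces several points at nearly the same distance spread over its height.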

[0048] For research purposes, the resulting clusters can be 

inserted as frames into a (not shown) real monitor display of 
the observed scene. In addition, the distances belonging to 
the segmented image regions may be specified in numerical 
values on the frames.

[0049] In addition to vehicles, other elevated objects, 

such as sign posts and road margins, are also segmented. In 
order to discard erroneous object hypotheses, the stereo-based 
object segmentation process within the detected image regions 
is followed by 2D object recognition. 

[0050] In the following text, the 2D feature extraction and 

the vehicle recognition will now be described. These 
processing steps are likewise shown in Figure 1. 

[0051] Road vehicles have significant features in the image 

plane, for example edges and corners, as well as symmetry. 
These features have been determined empirically for the
purpose of a search, and the recognition of road vehicles is 
carried out by means of direct comparison with a vehicle 
model. In the method shown here, a search is made in 
accordance with statistically verified 2D features 7, which 
are subsequently compared with the internal model depiction of 
vehicles from a neural network 8. Figure 4 shows a schematic
representation to clarify the principle of the 2D feature 
extraction during evaluation by a neural network. 

[0052] In order to determine significant and statistically 

verified 2D features 7 of road vehicles, a data set of 50 
images, which show cars in various scenes, was used as a 
basis. By using the method explained below, a plurality of
typical patterns of size 9x9, which often occur
in the scenes used, was determined (referred to below as
comparative patterns).

[0053] The comparative patterns typically occur at specific 

locations on the vehicle. For example, the features may occur 
in the lower region of the vehicles. At these locations, most 
road vehicles exhibit similar structural areas. These are, for 
example, the shadows under the car and the corners of the 
tires, as well as the course of the structural areas at the 
head lamps. 

[0054] In the segmented image regions, a search window is 

defined in order to calculate the features determined by means 
of the predefined comparative patterns. Depending on the 
distance of the hypothetical object, a search window of 
matched size is defined and correlated with the comparative 
patterns. The locations in the search window which exhibit a 
local maximum of the correlation function identify significant 
features, as Figure 5 shows.
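Two steps of this paragraph lend themselves to a short sketch: matching the search window size to the distance of the hypothetical object, and picking the local maxima of the correlation result. The pinhole scaling used for the window size, the 8-neighbour definition of a local maximum, the threshold, and the names `search_window_size` and `local_maxima` are all assumptions introduced for illustration.

```python
def search_window_size(object_size_m, distance_m, focal_length_px):
    """Projected size in pixels of an object of known real size.

    Assumes the pinhole model: size_px = f * size / distance.
    """
    return max(1, round(focal_length_px * object_size_m / distance_m))


def local_maxima(score_map, threshold):
    """Return (row, column, score) of strict local maxima above threshold.

    score_map: list of rows of correlation values for one comparative
    pattern. A location counts as a maximum when its score is strictly
    greater than that of every existing neighbour (8-neighbourhood).
    """
    rows, cols = len(score_map), len(score_map[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = score_map[r][c]
            if value < threshold:
                continue
            neighbours = [
                score_map[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c, value))
    return peaks
```

The coordinates returned by such a peak search, together with the index of the comparative pattern that produced them, would correspond to the input features mentioned in the next paragraph.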

[0055] The coordinates of the extrema and the associated 
comparative patterns provide the input features for the 
feed-forward network used. This network has been trained on 
the occurrence of typical combinations of features which 
identify vehicles. 

[0056] The real-time method according to the invention for 
the stereo-based tracking of objects at a considerable 
distance has been tried out in real road scenes. Figure 6 
represents the measured distance data from an approaching 
vehicle. As can be seen in Figure 6, a measurement inaccuracy 
of about ± 50 cm occurs at a distance of 100 meters. 

[0057] In order to keep the determined distance data free 
of noise and largely free of measurement errors caused by 
erroneously determined correspondences, the use of a Kalman 
filter is suggested, which supplies more meaningful results by 
taking the measured values into consideration over time. 
Extending the 2D feature extraction by texture measures and 
symmetry operations offers further potential for improving the 
method presented. 
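The suggested Kalman filtering of the distance data can be sketched as follows. The description does not specify the filter design, so everything here is an assumption made for illustration: a constant-velocity model with the state (distance, relative speed), the class name DistanceKalmanFilter, and the noise parameters. The default measurement variance of 0.25 m² merely mirrors the inaccuracy of about ± 50 cm mentioned above.

```python
class DistanceKalmanFilter:
    """Constant-velocity Kalman filter over the state (distance, relative speed)."""

    def __init__(self, distance, speed=0.0, variance=100.0,
                 process_noise=0.01, measurement_noise=0.25):
        self.d, self.v = distance, speed
        # 2x2 state covariance, initially uncorrelated and uncertain.
        self.p = [[variance, 0.0], [0.0, variance]]
        self.q, self.r = process_noise, measurement_noise

    def step(self, measured_distance, dt):
        # Predict: the object keeps its relative speed for dt seconds.
        d = self.d + self.v * dt
        (p00, p01), (p10, p11) = self.p
        a00 = p00 + dt * (p10 + p01) + dt * dt * p11 + self.q
        a01 = p01 + dt * p11
        a10 = p10 + dt * p11
        a11 = p11 + self.q
        # Update: only the distance is measured (H = [1, 0]).
        s = a00 + self.r
        k0, k1 = a00 / s, a10 / s
        residual = measured_distance - d
        self.d = d + k0 * residual
        self.v = self.v + k1 * residual
        self.p = [[(1 - k0) * a00, (1 - k0) * a01],
                  [a10 - k1 * a00, a11 - k1 * a01]]
        return self.d, self.v


# Hypothetical use: a vehicle approaching at 10 m/s, measured every 0.1 s.
kf = DistanceKalmanFilter(distance=100.0)
for k in range(1, 101):
    d, v = kf.step(100.0 - 10.0 * 0.1 * k, dt=0.1)
print(round(d, 2), round(v, 2))
```

Besides the smoothed distance, a filter of this kind also yields an estimate of the relative speed, which is one of the quantities the method uses to assess the relevance of an object.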

[0058] In summary, by using the method according to the 
invention, reliable distance determination and recognition of 
objects, in particular of road vehicles in front of and/or 
behind a travelling vehicle, is possible up to a considerable 
distance. 
Method of detecting objects in the vicinity of a road vehicle 
up to a considerable distance 

[The invention relates to a method of detecting objects in the 
vicinity of a road vehicle up to a considerable distance, 
according to the generic features of Patent Claim 1 . 

In order to assist the driver of a motor vehicle in road 
traffic, driver assistance systems have been developed, which 
are suitable for detecting situations in the road traffic 
which are anticipated to be hazardous. Such driver assistance 
systems can either warn the driver, on the basis of his 
behaviour, or intervene in the management of the vehicle. The 
intention here is to increase driving safety, to relieve the 
driver of monotonous driving tasks and therefore to make 
driving more convenient. 

On account of the high requirements on the reliability of 
systems which increase safety, at the current time, it is 
predominantly convenience systems which are available on the 
market. Examples of this are parking aids and intelligent 
cruise control systems. Driver assistance systems which 
increase safety are intended to register the surrounding 
traffic situation to an ever increasing extent and to take it 
into account. 

EP 0 558 027 B1 discloses a device for registering the 
distance between vehicles. In the case of this device, a pair 
of image sensors generates an image of an object, which is 
displayed to the driver. One region of this image is 
subdivided into windows. The distances from the driving 
vehicle to the object which is located in the respective 
window are registered. In this case, the distances are 
calculated by comparing two items of image information 
recorded by different image sensors in different windows. On 
the basis of the determined distance information, the 
respective object is determined. A grid which divides the 
relevant image region is used, surrounds the object to be 
registered and supplies further image information. The 
symmetry of this image information is determined, and the 
existence of a vehicle travelling in front is predicted by 
determining a level of stability of a horizontal movement of a 
line of symmetry and a second level of stability of the 
distances over time. 

This known registration device is used for the purpose of 
registering and recognizing vehicles located in front of the 
moving vehicle. The reliable recognition of objects is 
achieved only in the near region, however, since there the 
simple registration of lines of symmetry can be carried out 
with sufficient stability. In the remote region, this simple 
registration of symmetry is no longer adequate on its own 
because of the low resolution in the image and the resulting 
inaccuracy in the determination of the object. 

However, high requirements have to be placed on reliable 
object recognition in particular, in order that the driver is 
not given any erroneous information, which can lead to 
erroneous and hazardous reactions. In the case of intelligent 
systems, the vehicle itself could react in a manner presenting 
a traffic hazard, on the basis of the erroneous information. 
Reliable information is imperative, for example in the case of 
the lane-accurate recognition of vehicles at a considerable 
distance, both in and counter to the actual direction of 
travel. 

For the recognition of patterns of interest, DE 42 11 171 A1 
proposes a method which applies the cross-correlation of small 
singular extracts from the entire pattern of interest by means 
of block-by-block progressive image recognition via a trained 
classification network. 

DE 43 08 776 C2 discloses a device for monitoring the outer 
space around a vehicle which is travelling over one lane on a 
road, the said lane being defined by extended white lines. By 
means of image processing, the course of the road is 
determined by using three-dimensional position information 
from sections of the white lines. By utilizing the three- 
dimensional position information from the white lines, the 
white lines are separated from three-dimensional objects. For 
each section, the vertical extent of possible objects is 
determined. As a result, the coordinates for three-dimensional 
objects of interest, such as motor vehicles, motor cycles or 
pedestrians, can be defined in the coordinate system of the 
vehicle. In addition, it is possible to determine which object 
is concerned. 

The procedure described in DE 43 08 776 C2 for monitoring the 
outer space around a vehicle requires a great deal of 
computation. It is always necessary to determine the course of 
the registered region of the road, in order to be able to 
determine the position of objects in this road course. Since 
only a limited amount of computing power is available in a 
motor vehicle, such a monitoring device is little suited to 
practical use. In addition, the known monitoring device 
always depends on the presence of white boundary lines, which 
may not be found on the course of all roads. 

The object of the invention is to specify a method of 
detecting objects in the vicinity of a road vehicle up to a 
considerable distance which permits the reliable registration 
of objects, in particular of vehicles in front of and/or 
behind the road vehicle, and of their relevance to the 
situation on the basis of their position relative to the road 
vehicle. 

According to the invention, this object is achieved by the 
features of Patent Claim 1. The subclaims relate to 
advantageous developments of the subject of the invention. 

Accordingly, a method of detecting objects in the vicinity of 
a road vehicle up to a considerable distance is provided, in 
which the distance from a moving or stationary vehicle to one 
or more objects is calculated by distance-based image 
segmentation by means of stereo image processing, and 
characteristics of the detected objects are determined by 
object recognition in the segmented image regions. 

Determining the characteristics of the detected objects is 
intended to serve to clarify their relevance to the particular 
vehicle and therefore contribute to the understanding of the 
situation. 

The detection can preferably be carried out to the front or to 
the rear and employed, for example, to warn of jams, for 
distance control from the vehicle in front or for monitoring 
the rear space. In this case, an important point of view is 
that the relevance to the situation or the potential hazard of 
the detected objects is determined from their distance to the 
particular vehicle and the determined relative speed. 

Instead of evaluating pairs of stereo images, which are 
recorded by a stereo arrangement comprising optical sensors or 
cameras, in principle, even individually recorded images of 
different origin can be evaluated in order to determine the 
distance. 

According to a basic idea, image regions of elevated objects 
and/or flat objects are determined. Elevated objects and/or 
flat objects are detected by combining 3D points in accordance 
with predetermined criteria. Combining is also designated 
clustering. In this case, the elevated objects are determined 
through features with similar distance values and flat objects 
are determined through features with similar height values. By 
means of this procedure, objects can be recognized and 
assessed not only reliably with regard to their distance but 
also with regard to specific features. Distinguishing between 
elevated and flat objects is therefore easily possible. 

Features of similar distance values and/or similar height are 
combined in order to form clusters. This distinction between 
elevated and flat objects is very important for reliable 
object recognition, for example the recognition of other motor 
vehicles, and the distinction from road markings. Since 
appropriately high computing powers can be implemented 
nowadays in modern motor vehicles, image segmentation of this 
type by means of distance determination and clustering can be 
carried out reliably and quickly. 

The detected elevated objects may be, in particular, road 
vehicles, signposts, bridge columns, lamp posts and so on, 
whereas the detected flat objects may be, in particular, road 
markings and boundaries such as curb stones, crash barriers 
and so on. In this way, for example, the position of a road 
vehicle on a specific road lane can be determined simply. 

In addition, it is advantageous to know the relative position 
and the relative speed of the detected objects relative to one 
another and to the moving vehicle, in order to determine the 
relevance of the detected objects to the situation. To this 
end, the distance measurement is evaluated, and an accurate 
road-lane object association is determined. 

During the image segmentation, one image of the recorded pair 
of stereo images can be scanned for significant features of 
objects to be registered. The spacing of the significant 
features may then be determined by means of cross-correlation, 
by comparing the respective features in one stereo image of the 
pair with the same, corresponding features in the other stereo 
image of the pair, recorded at the same time, the disparities 
which occur being evaluated. 

By determining the spacing of significant features in the 
pixel range, 3D points in the real world are determined 
relative to the coordinate system of the measuring device. The 
information obtained in this way from 3D points is therefore 
determined from different objects, such as vehicles, road 
markings, crash barriers, and so on. 

For the purpose of object recognition, object hypotheses can 
be determined, which are verified by comparison with object 
models. 

In this way, for the purpose of object recognition, the 
segmented image regions may be scanned in accordance with 
predetermined, statistically verified 2D features of the 
objects to be recognized, and the detected objects may be 
compared by using a neural network for the classification of a 
specific object type. In this way, reliable object recognition 
is carried out. 

The relevant objects can be followed over time and their 
distance and lateral position relative to the particular 
vehicle can be determined, in order to assess the dynamic 
behaviour of the relevant objects. Only with knowledge of the 
dynamic behaviour of the determined objects are practical 
reactions of the driver or of the vehicle possible. An 
"anticipatory" mode of driving is therefore promoted. 

Furthermore, by means of this tracking, as it is known, 
phantom objects which occur sporadically can be suppressed, 
and the entire recognition performance can be increased. In 
this way, the number of extracted image regions to be 
classified in the image can be reduced, if these are checked 
for their local consistency by means of simple time tracking. 
By means of tracking the detected objects over time, the 
object characteristics, such as the distance, relative speed 
and relative acceleration, can be freed of measurement noise, 
for example by using a Kalman filter. 

In addition to the above-described stereo-based approach, in 
principle object registration methods based on radar and/or 
infrared signals in the remote range are also possible. 

Further advantages, features and details of the invention 
become clearer by using the following description in 
conjunction with the appended drawings, in which: 

Fig. 1 shows a schematic representation of the method steps 
according to the invention; 

Fig. 2 shows a schematic representation to clarify the 
principle of the distance determination in the case of 
cameras with the same focal length arranged in 
parallel; 

Fig. 3 shows a schematic representation to clarify the 
principle of the correspondence search by means of 
cross correlation; 

Fig. 4 shows a schematic representation to clarify the 
principle of the 2D feature extraction in the case of 
evaluation by a neural network according to the 
invention; 

Fig. 5 shows a schematic representation to clarify the 
principle of coordinate normalization; and 

Fig. 6 shows a representation of a distance profile of an 
approaching vehicle. 



In the following text, the image segmentation by means of 
stereo image processing is described, during which elevated 
objects are detected. This is carried out through clustering 
individual features with similar distances. Then, a vehicle 
recognition method will be presented, with which road vehicles 
in the segmented image regions are recognized. For this 
purpose, features typical of vehicles are extracted and then 
compared with the internal vehicle model depiction from a 
neural network. The basic procedure is shown schematically in 
Figure 1. 

Mono image processing is in principle also possible, given the 
use of similar means and a similar procedure. 

The characteristic that road vehicles are elevated by 
comparison with the road is used for the method of image 
segmentation presented here. To this end, use is made of a 
stereo camera system, with which it is possible to determine 
the distances of significant features which occur in the 
camera image on road vehicles. By means of this information, a 
statement about elevated objects is possible. The continually 
increasing computing power which is available in the vehicle 
nowadays permits real-time analysis of pairs of stereo images. 

It is also possible to determine reliably on which lane a 
registered road vehicle is located. It is then possible to 
make a statement about the relevance of this registered road 
vehicle to the situation, on the basis of its position 
relative to the particular vehicle. The driver and/or the 
particular vehicle can then react accordingly. 

Radar systems suitable for vehicles do not offer adequate 
lateral resolution for lane association, infrared systems have 
resolution and range problems, and ultrasound can generally be 
used only for the near range. Nevertheless, it is in principle 
conceivable to employ these systems instead of or in 
combination with stereo camera systems. 

The principle of distance determination in the case of the 
parallel camera arrangement used is represented in Figure 2 on 
the basis of the pinhole camera model. The point P in the 
world is projected onto the sensor surfaces of each camera via 
the projection centres. $u_0$ and $u_1$ represent the deviation from 
the projection centre. Their difference 

$$\Delta u = u_0 - u_1$$

is designated the disparity $\Delta u$. By means of trigonometry and 
the dimensions of the camera arrangement (focal length $f$ and base 
width $b$), the distance $d$ can be calculated: 

$$d = \frac{f \cdot b}{\Delta u}$$

Here, $b$ represents the base width, $f$ the focal length and $d$ 
the distance to the point P. $u_0$ and $u_1$ are the distances of the 
projections of the point P onto the sensor surface. 
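As a brief illustration of this relationship, the following Python sketch computes the distance from the two projection offsets. The function name, the requirement of a positive disparity and the numerical values of the rig (focal length of 1000 pixels, base width of 0.3 m) are assumptions chosen for the example and do not come from the description.

```python
def distance_from_disparity(u0: float, u1: float,
                            focal_length: float, base_width: float) -> float:
    """Return d = f * b / (u0 - u1) for a parallel stereo arrangement.

    u0, u1 and focal_length must share one unit (for example pixels);
    the result has the unit of base_width.
    """
    disparity = u0 - u1
    if disparity <= 0:
        # Assumed sign convention: points in front of the cameras
        # produce a positive disparity.
        raise ValueError("disparity must be positive")
    return focal_length * base_width / disparity


# Hypothetical rig: f = 1000 px, b = 0.3 m, disparity of 3 px.
print(distance_from_disparity(u0=13.0, u1=10.0,
                              focal_length=1000.0, base_width=0.3))
```

Because the distance is inversely proportional to the disparity, a fixed error in the measured disparity has a growing effect on the computed distance as the object moves further away.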

In the first processing step in the image segmentation, a 
search for significant features is carried out in one image of 
the pair of stereo images. A corresponding display (not shown) 
on a monitor or another display device may be provided only for 
research purposes. Significant features are supplied, for 
example, by edges, which occur reliably in the case of road 
vehicles. The locations of the selected edges, which define 
the image region to be correlated in the second processing 
step, may be marked, for example, by means of rectangular 
frames in the monitor display. 

In order to determine the spacing of the features displayed on 
the monitor, the respective disparities are determined by 
comparison with the second stereo image recorded at the same 
time. To this end, a search is made in each rectangular image 
region by means of cross correlation in the corresponding 
image. 

Figure 3 shows a schematic representation to clarify the 
principle of the correspondence search by means of cross 
correlation. 

On account of the parallel alignment of the cameras, the 
search region in the vertical direction may be restricted to 
the epipolar line, which is the respective image row in the 
case shown in Figure 3. In the horizontal direction, the 
corresponding search region is defined in the corresponding 
image in accordance with permissible disparities. 

By using the KKFMF (the local, average-free, normalized 
cross correlation function) as the correlation function, 
lightness differences in the pairs of images, which occur for 
example as a result of different solar radiation or different 
control of the cameras, have only a slight effect on the 
correlation value. 

The correlation coefficient from the KKFMF is calculated as 
follows: 

$$\mathrm{KKFMF}(x,y) = \frac{\sum_{i}\sum_{j} \bar{F}(i,j)\,\bar{P}_r(x+i,y+j)}{\sqrt{\sum_{i}\sum_{j} \bar{F}(i,j)^2 \cdot \sum_{i}\sum_{j} \bar{P}_r(x+i,y+j)^2}}$$

The values $\bar{F}(i,j)$ and $\bar{P}_r(x+i,y+j)$ represent the average-free 
grey values from the rectangular image regions $F(i,j)$ and 
$P_r(x+i,y+j)$. Because of the normalization, the results from the 
KKFMF lie within the interval [-1, 1]. The value 1 represents 
agreement in pairs, -1 represents correspondingly inverse 
agreement. 
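The correspondence search with an average-free, normalized cross correlation can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names zero_mean, kkfmf and best_disparity, the list-of-lists image representation, and the choice to search towards smaller column indices (columns x_left - d) are not taken from the description.

```python
import math


def zero_mean(patch):
    """Subtract the mean grey value from every pixel of a 2D patch."""
    n = sum(len(row) for row in patch)
    mean = sum(sum(row) for row in patch) / n
    return [[v - mean for v in row] for row in patch]


def kkfmf(feature, candidate):
    """Average-free, normalized cross correlation of two equally sized patches."""
    f = zero_mean(feature)
    p = zero_mean(candidate)
    num = sum(a * b for fr, pr in zip(f, p) for a, b in zip(fr, pr))
    den = math.sqrt(sum(a * a for r in f for a in r)
                    * sum(b * b for r in p for b in r))
    # A completely flat patch has no structure to correlate with.
    return num / den if den > 0 else 0.0


def best_disparity(feature, strip, x_left, max_disparity):
    """Search the same image rows (the epipolar line) of the second image.

    `strip` holds the rows of the second image that the feature occupies.
    Returns (best correlation value, disparity in pixels).
    """
    width = len(feature[0])
    best = (-2.0, None)
    for d in range(max_disparity + 1):
        x = x_left - d
        if x < 0:
            break
        candidate = [row[x:x + width] for row in strip]
        score = kkfmf(feature, candidate)
        if score > best[0]:
            best = (score, d)
    return best


# Hypothetical feature at column 4 of the first image; in the second image
# the same structure appears 2 pixels further left and 20 grey levels brighter.
feature = [[50, 90, 50], [60, 80, 60]]
strip = [[30, 30, 70, 110, 70, 30, 30, 30],
         [30, 30, 80, 100, 80, 30, 30, 30]]
print(best_disparity(feature, strip, x_left=4, max_disparity=4))
```

The example shows the property stated above: a uniform lightness offset between the two images does not reduce the correlation value at the correct position.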

In the last processing step in the image segmentation, 
combining (cluster formation) of features with similar 
distance values takes place (cf. Figure 1). The relative 
height of the clusters formed is compared with a fixed minimum 
height, in order to ensure an elevated object. In this case, 
elevated objects are determined through features with similar 
distance values, and flat objects are determined through 
features with similar height values. 
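The cluster formation for elevated objects can be sketched in Python as follows. The description does not specify the clustering procedure, so this is a simplified assumption: features are given as tuples (lateral position, height above the road, distance) in metres, neighbouring distance values are chained together if they differ by no more than a tolerance, and a cluster counts as elevated if its highest feature reaches the minimum height. The lateral position is ignored here, and flat objects (similar height values) are not covered.

```python
def cluster_by_distance(points, distance_tolerance, min_height):
    """Group 3D points (x, height, distance) with similar distance values.

    Returns only the clusters whose highest point reaches min_height,
    i.e. the candidates for elevated objects.
    """
    clusters = []
    for point in sorted(points, key=lambda p: p[2]):
        # Extend the current cluster while the gap in distance stays small.
        if clusters and point[2] - clusters[-1][-1][2] <= distance_tolerance:
            clusters[-1].append(point)
        else:
            clusters.append([point])
    return [c for c in clusters if max(p[1] for p in c) >= min_height]


# Hypothetical features: three on the rear of a vehicle at about 50 m,
# two on a road marking at 20 m and 30 m.
features = [
    (0.0, 0.3, 50.1), (0.5, 1.2, 50.4), (-0.4, 0.9, 49.8),
    (1.8, 0.0, 20.0), (1.8, 0.02, 30.0),
]
elevated = cluster_by_distance(features, distance_tolerance=1.0, min_height=0.5)
print(len(elevated), "elevated cluster(s)")
```

In this example only the three vehicle features form an elevated cluster; the road-marking features lie at different distances and stay below the minimum height.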

For research purposes, the resulting clusters can be inserted 
as frames into a (not shown) real monitor display of the 
observed scene. In addition, the distances belonging to the 
segmented image regions may be specified in numerical values 
on the frames. 

In addition to vehicles, other elevated objects, such as sign 
posts and road margins, are also segmented. In order to 
discard erroneous object hypotheses, the stereo-based object 
segmentation process within the detected image regions is 
followed by 2D object recognition. 

In the following text, the 2D feature extraction and the 
vehicle recognition will now be described. These processing 
steps are likewise shown in Figure 1. 

Road vehicles have significant features in the image plane, 
for example edges and corners, as well as symmetry. These 
features have been determined empirically for the purpose of a 
search, and the recognition of road vehicles is carried out by 
means of direct comparison with a vehicle model. In the method 
shown here, a search is made in accordance with statistically 
verified 2D features, which are subsequently compared with the 
internal model depiction of vehicles from a neural network. 
Figure 4 shows a schematic representation to clarify the 
principle of the 2D feature extraction during evaluation by a 
neural network. 

In order to determine significant and statistically verified 
2D features of road vehicles, a data set of 50 images, which 
show cars in various scenes, was used as a basis. By using the 
method explained below, a determination of a plurality of 9x9 
large typical patterns, which often occur in the scenes used, 
was carried out (referred to below as comparative patterns). 

The comparative patterns typically occur at specific locations 
on the vehicle. For example, the features may occur in the 
lower region of the vehicles. At these locations, most road 
vehicles exhibit similar structural areas. These are, for 
example, the shadows under the car and the corners of the 
tyres, as well as the course of the structural areas at the 
headlamps. 

In the segmented image regions, a search window is defined in 
order to calculate the features determined by means of the 
predefined comparative patterns. Depending on the distance of 
the hypothetical object, a search window of matched size is 
defined and correlated with the comparative patterns. The 
locations in the search window which exhibit a local maximum 
of the correlation function identify significant features, as 
Figure 5 shows. 

The coordinates of the extrema and the associated comparison 
patterns provide the input features for the feed forward 
network used. This has been trained for the occurrence of 
typical combinations of features which identify vehicles. 

The real-time method according to the invention for the 
stereo-based tracking of objects at a considerable distance 
has been tried out in real road scenes. Figure 6 represents 
the measured distance data from an approaching vehicle. As can 
be seen in Figure 6, a measurement inaccuracy of about ± 50 cm 
occurs at 100 metres distance. 

In order to keep the determined distance data free of noise 
and largely free of measurement errors on account of 
erroneously determined correspondences, the use of a Kalman 
filter is suggested, which supplies more meaningful results as 
a result of the consideration of the measured values over 
time. By extending the 2D feature extraction by texture 
dimensions and symmetry operations, further potential is 
provided for improving the method presented. 

In summary, it is to be recorded that, by using the method 
according to the invention, reliable distance determination 
and recognition of objects, in particular of road vehicles in 
front of and/or behind a travelling vehicle, is possible up to 
a considerable distance.] 

BACKGROUND AND SUMMARY OF THE INVENTION 

The invention relates to a method of detecting objects in 
the vicinity of a road vehicle up to a considerable distance, 
in which the distance from a moving or stationary vehicle to 
one or more objects is calculated by distance-based image 
segmentation using stereo image processing, and 
characteristics of the detected objects are determined by 
object recognition in the segmented image regions. Image 
regions of elevated objects and/or flat objects are determined 
and the elevated objects and/or flat objects are detected by 
combining (clustering) 3D points in accordance with 
predetermined criteria. The elevated objects are determined 
through features with similar distance values and the flat 
objects are determined through features with similar height 
values. The relevant objects are followed over time 
(tracking) and their distance and lateral position relative to 
the particular vehicle are determined in order to assess the 
dynamic behavior of the relevant objects. 

In order to assist the driver of a motor vehicle in road 
traffic, driver assistance systems have been developed, which 
are suitable for detecting situations in the road traffic 
which are anticipated to be hazardous. Such driver assistance 
systems can either warn the driver, on the basis of his 
behavior, or intervene in the management of the vehicle. The 
intention here is to increase driving safety, to relieve the 
driver of monotonous driving tasks and, therefore, to make 
driving more convenient. 

On account of the high requirements on the reliability of 
systems which increase safety, at the current time, it is 
predominantly convenience systems which are available on the 
market. Examples of this are parking aids and intelligent 
cruise control systems. Driver assistance systems which 
increase safety are intended to register the surrounding 
traffic situation to an ever increasing extent and to take it 
into account. 

EP 0 558 027 B1 discloses a device for registering the 
distance between vehicles. In the case of this device, a pair 
of image sensors generates an image of an object, which is 
displayed to the driver. One region of this image is 
subdivided into windows. The distances from the driving 
vehicle to the object, which is located in the respective 
window, are registered. In this case, the distances are 
calculated by comparing two items of image information 
recorded by different image sensors in different windows. On 
the basis of the determined distance information, the 
respective object is determined. A grid which divides the 
relevant image region is used. The grid surrounds the object 
to be registered and supplies further image information. The 
symmetry of this image information is determined, and the 
existence of a vehicle travelling in front is predicted by 
determining a level of stability of a horizontal movement of a 
line of symmetry and a second level of stability of the 
distances over time. 

This known registration device is used for the purpose of 
registering and recognizing vehicles located in front of the 
moving vehicle. The reliable recognition of objects is 
achieved only in the near region, however, since there the 
simple registration of lines of symmetry can be carried out 
with sufficient stability. In the remote region, this simple 
registration of symmetry is no longer adequate on its own 
because of the low resolution in the image and the resulting 
inaccuracy in the determination of the object. 

However, high requirements have to be placed on reliable 
object recognition in particular, in order that the driver is 
not given any erroneous information, which can lead to 
erroneous and hazardous reactions. In the case of intelligent 
systems, the vehicle itself could react in a manner presenting 
a traffic hazard, on the basis of the erroneous information. 
Reliable information is imperative, for example in the 
lane-accurate recognition of vehicles at a considerable 
distance, both in and counter to the actual direction of travel. 

For the recognition of patterns of interest, 
DE 42 11 171 A1 proposes a method which applies the 
cross-correlation of small singular extracts from the entire 
pattern of interest by means of block-by-block progressive 
image recognition via a trained classification network. 

DE 43 08 776 C2 discloses a device for monitoring the 
outer space around a vehicle, which is travelling over one 
lane on a road. The lane is defined by extended white lines. 
By means of image processing, the course of the road is 
determined by using three-dimensional position information 
from sections of the white lines. By utilizing the three- 
dimensional position information from the white lines, the 
white lines are separated from three-dimensional objects. For 
each section, the vertical extent of possible objects is 
determined. As a result, the coordinates for three-dimensional 
objects of interest, such as motor vehicles, motor cycles or 
pedestrians, can be defined in the coordinate system of the 
vehicle. In addition, it is possible to determine which object 
is concerned. 

The procedure described in DE 43 08 776 C2 for monitoring 
the outer space around a vehicle requires a great deal of 
computation. It is always necessary to determine the course of 
the registered region of the road, in order to be able to 
determine the position of objects in this road course. Since 
only a limited amount of computing power is available in a 
motor vehicle, such a monitoring device is ill-suited to 
practical use. In addition, the known monitoring device 
always depends on the presence of white boundary lines, which 
may not be found on the course of all roads. 

EP-A-0 874 331 discloses the practice of dividing up a 
distance image into regions in the lateral direction away from 
the vehicle. In this case, a histogram relating to the 
distance values in the individual regions is drawn up, in 
order to determine the distances of individual objects from 
these histograms. The possibility of a collision or contact 
with objects or other vehicles on the roadway is determined 
from the position and size of the objects or vehicles. The 
relative speed of the objects in relation to the particular 
vehicle is determined by tracking the objects. A reliable 
statement relating to the relevance of the objects to the 
situation is possible only after a very computationally 
intensive procedure, which calls a practical application in 
road vehicles into question. 

The object of the invention is to specify a method of 
detecting objects in the vicinity of a road vehicle up to a 
considerable distance which permits the reliable registration 
of objects, in particular of vehicles in front of and/or 
behind the road vehicle, and of their relevance to the situation 
on the basis of their position relative to the road vehicle. 

According to the invention, this object is achieved by 
determining, for the purpose of object recognition, object 
hypotheses which are verified by comparison with object 
models. Segmented image regions are scanned in accordance 
with predetermined, statistically verified 2D features of the 
objects to be recognized. The detected objects are compared 
by using a neural network for the classification of a specific 
object type. The subclaims relate to advantageous developments 
of the subject of the invention. 

Accordingly, a method of detecting objects in the 
vicinity of a road vehicle up to a considerable distance is 
provided, in which the distance from a moving or stationary 
vehicle to one or more objects is calculated by distance-based 
image segmentation by means of stereo image processing, and 
characteristics of the detected objects are determined by 
object recognition in the segmented image regions. 

Determining the characteristics of the detected objects 
is intended to serve to clarify their relevance to the 
particular vehicle and therefore contribute to the 
understanding of the situation. 

The detection can preferably be carried out to the front 
or to the rear and employed, for example, to warn of traffic 
jams, for distance control from the vehicle in front or for 
monitoring the rear space. In this case, an important point of 
view is that the relevance to the situation or the potential 
hazard of the detected objects is determined from their 
distance to the particular vehicle and the determined relative 
speed. 

Instead of evaluating pairs of stereo images, which are 
recorded by a stereo arrangement comprising optical sensors or 
cameras, in principle, even individually recorded images of 
different origin can be evaluated in order to determine the 
distance. 

Image regions of elevated objects and/or flat objects are 
determined. Elevated objects and/or flat objects are detected 
by combining 3D points in accordance with predetermined 
criteria. Combining is also designated clustering. In this 
case, the elevated objects are determined through features 
with similar distance values and flat objects are determined 
through features with similar height values. By means of this 
procedure, objects can be recognized and assessed not only 
reliably with regard to their distance but also with regard to 
specific features. Distinguishing between elevated and flat 
objects is therefore easily possible. 

Features of similar distance values and/or similar height 
are combined in order to form clusters. This distinction 
between elevated and flat objects is very important for 
reliable object recognition, for example the recognition of 
other motor vehicles, and the distinction from road markings. 
Since appropriately high computing powers can be implemented 
nowadays in modern motor vehicles, image segmentation of this 
type by means of distance determination and clustering can be 
carried out reliably and quickly. 

The relevant objects are followed over time and their 
distance and lateral position relative to the particular 
vehicle are determined, in order to assess the dynamic 
behavior of the relevant objects. Only with knowledge of the 
dynamic behavior of the determined objects are practical 
reactions of the driver or of the vehicle possible. An 
"anticipatory" mode of driving is therefore promoted. 

Furthermore, by means of this tracking, as it is known, 
phantom objects which occur sporadically can be suppressed, 
and the entire recognition performance can be increased. In 
this way, the number of extracted image regions to be 
classified in the image can be reduced, if these are checked 
for their local consistency by means of simple time tracking. 
By means of tracking the detected objects over time, the 
object characteristics, such as the distance, relative speed 
and relative acceleration, can be freed of measurement noise, 
for example by using a Kalman filter. 

For the purpose of object recognition, object hypotheses 
are determined, which are verified by comparison with object 
models. 

In this way, for the purpose of object recognition, the 
segmented image regions may be scanned in accordance with 
predetermined, statistically verified 2D features of the 
objects to be recognized, and the detected objects may be 
compared by using a neural network for the classification of a 
specific object type. In this way, reliable object recognition 
is carried out. 

The detected elevated objects may be, in particular, road 
vehicles, signposts, bridge columns, lamp posts and so on, 
whereas the detected flat objects may be, in particular, road 
markings and boundaries such as curb stones, crash barriers 
and so on. In this way, for example, the position of a road 
vehicle on a specific road lane can be determined in a simple 
way. 

In addition, it is advantageous to know the relative 
position and the relative speed of the detected objects 
relative to one another and to the moving vehicle, in order to 
determine the relevance of the detected objects to the 
situation. To this end, the distance measurement is evaluated, 
and an accurate road-lane object association is determined. 

During the image segmentation, one of the recorded pairs 
of stereo images can be scanned for significant features of 
objects to be registered. The spacing of the significant 
features may then be determined by means of cross correlation, by 
comparing the respective features in a stereo image from the 
pair of stereo images with the same, corresponding features in 
the other stereo image from the pair of stereo images, 
recorded at the same time. The disparities which occur are 
evaluated. 

By determining the spacing of significant features in the 
pixel range, 3D points in the real world are determined 
relative to the coordinate system of the measuring device. The 
3D point information obtained in this way therefore originates 
from different objects, such as vehicles, road markings, crash 
barriers, and so on. 

In addition to the above-described stereo-based approach, 
object registration methods based on radar and/or infrared 
signals are in principle also possible in the remote range. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Further advantages, features and details of the invention 
will become apparent from the following description in 
conjunction with the appended drawings, in which: 

Fig. 1 shows a schematic representation of the method 
steps according to the invention; 

Fig. 2 shows a schematic representation to clarify the 
principle of the distance determination in the case of cameras 
with the same focal length arranged in parallel; 

Fig. 3 shows a schematic representation to clarify the 
principle of the correspondence search by means of cross 
correlation; 

Fig. 4 shows a schematic representation to clarify the 
principle of the 2D feature extraction in the case of 
evaluation by a neural network according to the invention; 

Fig. 5 shows a schematic representation to clarify the 
principle of coordinate normalization; and 

Fig. 6 shows a representation of a distance profile of an 
approaching vehicle. 

DETAILED DESCRIPTION OF THE DRAWINGS 

In the following text, the image segmentation 1 by means 
of stereo image processing is described, during which elevated 
objects 2 are detected. This is carried out by clustering 3 
individual features with similar distances. Then, a vehicle 
recognition method 5, 6 will be presented, with which road 
vehicles in the segmented image regions are recognized. For 
this purpose, features typical of vehicles are extracted 6 and 
then compared with the internal vehicle model depiction 5 from 
a neural network 8. The basic procedure is shown schematically 
in Figure 1. 

Mono image processing is in principle also possible, 
given the use of similar means and a similar procedure. 

The characteristic that road vehicles are elevated by 
comparison with the road is used for the method of image 
segmentation presented here. To this end, use is made of a 
stereo camera system, with which it is possible to determine 
the distances of significant features which occur in the 
camera image on road vehicles. By means of this information, a 
statement about elevated objects 4 is possible. The 
continually increasing computing power, which is available in 
the vehicle nowadays, permits real-time analysis of pairs of 
stereo images. 

It is also possible to determine reliably on which lane a 
registered road vehicle is located. It is then possible to 
make a statement about the relevance of this registered road 
vehicle to the situation, on the basis of its position 
relative to the particular vehicle. The driver and/or the 
particular vehicle can then react accordingly. 

Although radar systems suitable for vehicles do not offer 
adequate lateral resolution for lane association, infrared 
systems have resolution and range problems and ultrasound can 
generally be used for the near range, it is in principle 
conceivable to employ these systems instead of or in 
combination with stereo camera systems. 

The principle of distance determination in the case of 
the parallel camera arrangement used is represented in Figure 
2 on the basis of the pinhole camera model. The point P in the 
world (camera's field of view) is projected onto the sensor 
surfaces of each camera via the projection centers. $u_0$ and $u_1$ 
represent the deviation from the projection center. Their 
difference 

$$\Delta u = u_0 - u_1$$

is designated the disparity $\Delta u$. By means of trigonometry and 
the sizes of the camera arrangement (focal length f and base 
width b), the distance d can be calculated: 

$$d = \frac{f \cdot b}{\Delta u}$$

Here, b represents the base width, f the focal length and d 
the distance to the point P. $u_0$ and $u_1$ are the distances of the 
projections of the point P onto the sensor surface. 
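
The relation d = f · b / Δu can be illustrated with a minimal Python sketch. The function names, the sign convention (u0 greater than u1 for a point in front of the cameras) and the numerical values are assumptions chosen for illustration; they are not taken from the specification. The second function is a first-order error estimate obtained by differentiating the same relation, and shows that the distance uncertainty grows with the square of the distance.

```python
def distance_from_disparity(u0, u1, focal_length, base_width):
    """Return d = f * b / (u0 - u1) for a parallel stereo arrangement.

    u0, u1 and focal_length share one unit (e.g. pixels); the result
    has the unit of base_width (e.g. metres).
    """
    disparity = u0 - u1
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length * base_width / disparity


def distance_error(distance, focal_length, base_width, disparity_error):
    """First-order estimate |dd| = d**2 / (f * b) * |d(delta_u)|."""
    return distance ** 2 / (focal_length * base_width) * disparity_error


# Hypothetical camera: focal length 800 pixels, base width 0.3 m.
d = distance_from_disparity(u0=12.0, u1=9.6, focal_length=800.0, base_width=0.3)
# Effect of a hypothetical disparity error of 0.1 pixel at that distance.
dd = distance_error(d, focal_length=800.0, base_width=0.3, disparity_error=0.1)
```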

In the first processing step in the image segmentation, a 
search for significant features is carried out in one of the 
pairs of stereo images. A corresponding display (not shown) on 
a monitor or another display device may be provided only for 
research purposes. Significant features are supplied, for 
example, by edges, which occur reliably in the case of road 
vehicles. The locations of the selected edges, which define 
the image region to be correlated in the second processing 
step, may be marked, for example, by means of rectangular 
frames in the monitor display. 

In order to determine the spacing of the features 
displayed on the monitor, the respective disparities are 
determined by comparison with the second stereo image recorded 
at the same time. To this end, a search is made in each 
rectangular image region by means of cross correlation in the 
corresponding image. Figure 3 shows a schematic representation 
to clarify the principle of the correspondence search by means 
of cross correlation 11. 

On account of the parallel alignment of the cameras, the 
search region in the vertical direction may be restricted to 
the epipolar line, which is the respective image row in the 
case shown in Figure 3. In the horizontal direction, the 
search region is defined in the corresponding image 9, 10 in 
accordance with permissible disparities. 

By means of using KKFMF (the local, average-free, 
normalized cross correlation function) as the correlation 
function, lightness differences in the pairs of images 9, 10, 
which occur for example as a result of different solar 
radiation or different control of the cameras, have only a 
slight effect on the correlation value. 

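The correlation coefficient from the KKFMF is calculated 
as follows: 

$$\mathrm{KKFMF}(x, y) = \frac{\sum_{j}\sum_{i} \overline{F(i,j)} \cdot \overline{P_r(x+i, y+j)}}{\sqrt{\sum_{j}\sum_{i} \overline{F(i,j)}^{2} \cdot \sum_{j}\sum_{i} \overline{P_r(x+i, y+j)}^{2}}}$$

The values $\overline{F(i,j)}$ and $\overline{P_r(x+i, y+j)}$ represent the 
average-free grey values from the rectangular image regions F(i,j) and 
P_r(x+i,y+j). Because of the normalization, the results from the 
KKFMF move within the interval [-1, 1]. The value 1 represents 
agreement in pairs, -1 represents correspondingly inverse 
agreement. 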
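
The correspondence search can be sketched in Python with NumPy as follows. This is a minimal illustration, not the implementation of the method: the function names, the patch geometry and the assumption that a feature from the first image appears shifted towards smaller column indices in the second image are all hypothetical. The function `kkfmf` computes an average-free, normalized cross correlation of two equally sized patches, and `best_disparity` evaluates it along the same image rows within a permissible disparity range.

```python
import numpy as np


def kkfmf(feature_patch, candidate_patch):
    """Average-free, normalized cross correlation of two equally sized patches."""
    f = feature_patch.astype(float) - feature_patch.mean()
    p = candidate_patch.astype(float) - candidate_patch.mean()
    denom = np.sqrt((f * f).sum() * (p * p).sum())
    if denom == 0.0:
        return 0.0  # a patch without contrast carries no information
    return float((f * p).sum() / denom)


def best_disparity(left, right, row, col, height, width, max_disparity):
    """Search the same rows of `right` for the patch taken from `left`.

    Returns (disparity in pixels, correlation value) of the best match.
    """
    patch = left[row:row + height, col:col + width]
    best_shift, best_score = 0, -1.0
    for shift in range(max_disparity + 1):
        x = col - shift  # assumed search direction
        if x < 0:
            break
        score = kkfmf(patch, right[row:row + height, x:x + width])
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift, best_score
```

Because the mean is subtracted and the result is normalized, a patch and a brightened or contrast-scaled copy of it still correlate with a value close to 1, which is the insensitivity to lightness differences described above.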

In the last processing step in the image segmentation, 
combining (cluster formation) of features with similar 
distance values takes place (cf. Figure 1). The relative 
height of the clusters formed is compared with a fixed minimum 
height, in order to ensure an elevated object 2. In this case, 
elevated objects are determined through features with similar 
distance values, and flat objects are determined through 
features with similar height values. 
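
The cluster formation for elevated objects can be sketched as follows. This is a simplified illustration under stated assumptions: the `Point3D` type, the gap threshold `max_gap`, the value of `min_height` and the interpretation of "relative height" as the highest point of a cluster above the road are hypothetical, and the lateral position is ignored during grouping for brevity.

```python
from dataclasses import dataclass


@dataclass
class Point3D:
    lateral: float   # metres to the side of the viewing axis
    height: float    # metres above the road surface
    distance: float  # metres along the viewing direction


def cluster_by_distance(points, max_gap=1.0):
    """Group points whose sorted distance values differ by at most max_gap."""
    clusters = []
    for point in sorted(points, key=lambda p: p.distance):
        if clusters and point.distance - clusters[-1][-1].distance <= max_gap:
            clusters[-1].append(point)
        else:
            clusters.append([point])
    return clusters


def elevated_clusters(points, max_gap=1.0, min_height=0.5):
    """Keep only clusters whose highest point reaches the minimum height."""
    return [
        cluster
        for cluster in cluster_by_distance(points, max_gap)
        if max(p.height for p in cluster) >= min_height
    ]
```

With these assumptions, features on a vehicle rear at roughly the same distance form one cluster that passes the height test, whereas features on a road marking are grouped but discarded as not elevated.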

For research purposes, the resulting clusters can be 
inserted as frames into a real monitor display (not shown) of 
the observed scene. In addition, the distances belonging to 
the segmented image regions may be specified as numerical 
values on the frames. 

In addition to vehicles, other elevated objects, such as 
sign posts and road margins, are also segmented. In order to 
discard erroneous object hypotheses, the stereo-based object 
segmentation process within the detected image regions is 
followed by 2D object recognition. 

In the following text, the 2D feature extraction and the 
vehicle recognition will now be described. These processing 
steps are likewise shown in Figure 1. 

Road vehicles have significant features in the image 
plane, for example edges and corners, as well as symmetry. 
These features have been determined empirically for the 
purpose of a search, and the recognition of road vehicles is 
carried out by means of direct comparison with a vehicle 
model. In the method shown here, a search is made in 
accordance with statistically verified 2D features 7, which 
are subsequently compared with the internal model depiction of 
vehicles from a neural network 8. Figure 4 shows a schematic 
representation to clarify the principle of the 2D feature 
extraction during evaluation by a neural network. 

In order to determine significant and statistically 
verified 2D features 7 of road vehicles, a data set of 50 
images, which show cars in various scenes, was used as a 
basis. By using the method explained below, a plurality of 
typical patterns of size 9x9, which often occur in the scenes 
used, was determined (referred to below as comparative 
patterns). 

The comparative patterns typically occur at specific 
locations on the vehicle. For example, the features may occur 
in the lower region of the vehicles. At these locations, most 
road vehicles exhibit similar structural areas. These are, for 
example, the shadows under the car and the corners of the 
tires, as well as the course of the structural areas at the 
head lamps. 

In the segmented image regions, a search window is 
defined in order to calculate the features determined by means 
of the predefined comparative patterns. Depending on the 
distance of the hypothetical object, a search window of 
matched size is defined and correlated with the comparative 
patterns. The locations in the search window which exhibit a 
local maximum of the correlation function identify significant 
features, as Figure 5 shows. 
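
The search for local correlation maxima within a search window can be sketched as follows. The sketch makes several assumptions that are not stated in the specification: an average-free, normalized correlation is reused as the correlation function, a hypothetical threshold selects candidate maxima, and the distance-dependent scaling of the search window is omitted. The function names are illustrative only.

```python
import numpy as np


def correlation_map(window, pattern):
    """Average-free, normalized correlation of `pattern` at every valid position."""
    ph, pw = pattern.shape
    p = pattern.astype(float) - pattern.mean()
    p_norm = np.sqrt((p * p).sum())
    rows = window.shape[0] - ph + 1
    cols = window.shape[1] - pw + 1
    result = np.zeros((rows, cols))
    for y in range(rows):
        for x in range(cols):
            patch = window[y:y + ph, x:x + pw].astype(float)
            patch -= patch.mean()
            denom = p_norm * np.sqrt((patch * patch).sum())
            if denom > 0.0:
                result[y, x] = (p * patch).sum() / denom
    return result


def local_maxima(corr, threshold=0.8):
    """Return (row, col, value) for positions not exceeded by any neighbour."""
    peaks = []
    for y in range(corr.shape[0]):
        for x in range(corr.shape[1]):
            value = corr[y, x]
            if value < threshold:
                continue
            y0, y1 = max(y - 1, 0), min(y + 2, corr.shape[0])
            x0, x1 = max(x - 1, 0), min(x + 2, corr.shape[1])
            if value >= corr[y0:y1, x0:x1].max():
                peaks.append((y, x, float(value)))
    return peaks
```

Applied once per comparative pattern, the returned coordinates together with the index of the pattern would form the kind of feature list that is passed on to the classifier.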

The coordinates of the extrema and the associated 
comparative patterns provide the input features for the 
feedforward network used. This has been trained on the occurrence 
of typical combinations of features which identify vehicles. 

The real-time method according to the invention for the 
stereo-based tracking of objects at a considerable distance 
has been tested in real road scenes. Figure 6 
represents the measured distance data from an approaching 
vehicle. As can be seen in Figure 6, a measurement inaccuracy 
of about ± 50 cm occurs at a distance of 100 meters. 

In order to keep the determined distance data free of 
noise and largely free of measurement errors caused by 
erroneously determined correspondences, the use of a Kalman 
filter is suggested, which supplies more meaningful results by 
taking the measured values into account over 
time. Extending the 2D feature extraction by texture 
dimensions and symmetry operations offers further potential 
for improving the method presented. 
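
The suggested use of a Kalman filter can be illustrated with a minimal sketch. The specification does not state the filter model; the constant-velocity state (distance, relative speed), the noise parameters `meas_var` and `accel_var`, the initial uncertainty and the function name are assumptions made for this example.

```python
import numpy as np


def kalman_track(distances, dt, meas_var, accel_var):
    """Filter distance measurements with a constant-velocity Kalman filter.

    Returns one (distance, relative_speed) estimate per measurement.
    """
    F = np.array([[1.0, dt], [0.0, 1.0]])            # state transition
    H = np.array([[1.0, 0.0]])                       # only the distance is measured
    Q = accel_var * np.array([[dt**4 / 4, dt**3 / 2],
                              [dt**3 / 2, dt**2]])   # process noise
    R = np.array([[meas_var]])                       # measurement noise
    x = np.array([[distances[0]], [0.0]])            # initial state
    P = np.diag([meas_var, 100.0])                   # initial uncertainty (assumed)
    estimates = []
    for z in distances:
        # Prediction step
        x = F @ x
        P = F @ P @ F.T + Q
        # Update step with the new distance measurement
        innovation = np.array([[z]]) - H @ x
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ innovation
        P = (np.eye(2) - K @ H) @ P
        estimates.append((float(x[0, 0]), float(x[1, 0])))
    return estimates
```

Besides smoothing the distance, such a filter yields the relative speed as part of its state, which is one of the object characteristics the method evaluates.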

In summary, it can be stated that, by using the 
method according to the invention, reliable distance 
determination and recognition of objects, in particular of 
road vehicles in front of and/or behind a travelling vehicle, 
is possible up to a considerable distance. 

The invention relates to a method of detecting objects 
in the vicinity of a road vehicle up to a considerable 
distance, according to the generic features of Patent 
Claim 1. 

In order to assist the driver of a motor vehicle in 
road traffic, driver assistance systems have been 
developed, which are suitable for detecting situations 
in the road traffic which are anticipated to be 
hazardous. Such driver assistance systems can either 
warn the driver, on the basis of his behaviour, or 
intervene in the management of the vehicle. The 
intention here is to increase driving safety, to 
relieve the driver of monotonous driving tasks and 
therefore to make driving more convenient. 

On account of the high requirements on the reliability 
of systems which increase safety, at the current time, 
it is predominantly convenience systems which are 
available on the market. Examples of this are parking 
aids and intelligent cruise control systems. Driver 
assistance systems which increase safety are intended 
to register the surrounding traffic situation to an 
ever increasing extent and to take it into account. 

EP 0 558 027 B1 discloses a device for registering the 
distance between vehicles. In the case of this device, 
a pair of image sensors generates an image of an 
object, which is displayed to the driver. One region of 
this image is subdivided into windows. The distances 
from the driving vehicle to the object which is located 
in the respective window are registered. In this case, 
the distances are calculated by comparing two items of 
image information recorded by different image sensors 
in different windows. On the basis of the determined 
distance information, the respective object is 
determined. A grid which divides the relevant image 
region is used; it surrounds the object to be registered 
and supplies further image information. The symmetry of 
this image information is determined, and the existence 
of a vehicle travelling in front is predicted by 
determining a level of stability of a horizontal 
movement of a line of symmetry and a second level of 
stability of the distances over time. 



This known registration device is used for the purpose 
of registering and recognizing vehicles located in 
front of the moving vehicle. The reliable recognition 
of objects is achieved only in the near region, 
however, since there the simple registration of lines 
of symmetry can be carried out with sufficient 
stability. In the remote region, this simple 
registration of symmetry is no longer adequate on its 
own because of the low resolution in the image and the 
resulting inaccuracy in the determination of the 
object. 

However, high requirements have to be placed on 
reliable object recognition in particular, in order 
that the driver is not given any erroneous information, 
which can lead to erroneous and hazardous reactions. In 
the case of intelligent systems, the vehicle itself 
could react in a manner presenting a traffic hazard, on 
the basis of the erroneous information. Reliable 
information is imperative, for example in the case of 
the accurate-lane recognition of vehicles at a 
considerable distance, both in and counter to the 
actual direction of travel. 

For the recognition of interesting patterns, 
DE 42 11 171 A1 proposes a method which applies the 
cross correlation of small singular extracts from the 
entire pattern of interest by means of block-by-block 
progressive image recognition via a trained 
classification network. 

DE 43 08 776 C2 discloses a device for monitoring the 
outer space around a vehicle which is travelling over 
one lane on a road, the said lane being defined by 
extended white lines. By means of image processing, the 
course of the road is determined by using three-dimensional 
position information from sections of the 
white lines. By utilizing the three-dimensional 
position information from the white lines, the white 
lines are separated from three-dimensional objects. For 
each section, the vertical extent of possible objects 
is determined. As a result, the coordinates for three-dimensional 
objects of interest, such as motor 
vehicles, motor cycles or pedestrians, can be defined 
in the coordinate system of the vehicle. In addition, 
it is possible to determine which object is concerned. 

The procedure described in DE 43 08 776 C2 for 
monitoring the outer space around a vehicle requires a 
great deal of computation. It is always necessary to 
determine the course of the registered region of the 
road, in order to be able to determine the position of 
objects in this road course. Since only a limited 
amount of computing power is available in a motor 
vehicle, such a monitoring device is little suited to 
practical use. In addition, the known monitoring device 
always depends on the presence of white boundary 
lines, which may not be found on the course of all 
roads. 

EP-A-0 874 331 discloses the practice of dividing up a 
distance image into regions in the lateral direction 
away from the vehicle. In this case, a histogram 
relating to the distance values in the individual 
regions is drawn up, in order to determine the 
distances of individual objects from these histograms. 
The possibility of a collision or contact with objects 
or other vehicles on the roadway is determined from the 
position and size of the objects or vehicles. The 
relative speed of the objects in relation to the 
particular vehicle is determined by tracking the 
objects. A reliable statement relating to the relevance 
of the objects to the situation is possible only after 
a very computationally intensive procedure, which calls 
a practical application in road vehicles into question. 

The object of the invention is to specify a method of 
detecting objects in the vicinity of a road vehicle up 
to a considerable distance which permits the reliable 
registration of objects, in particular of vehicles in 
front of and/or behind the road vehicle and their 
relevance to the situation on the basis of their position 
relative to the road vehicle. 

According to the invention, this object is achieved by 
the features of Patent Claim 1. The subclaims relate to 
advantageous developments of the subject of the 
invention. 

Accordingly, a method of detecting objects in the 
vicinity of a road vehicle up to a considerable 
distance is provided, in which the distance from a 
moving or stationary vehicle to one or more objects is 
calculated by distance-based image segmentation by 
means of stereo image processing, and characteristics 
of the detected objects are determined by object 
recognition in the segmented image regions. 

Determining the characteristics of the detected objects 
is intended to serve to clarify their relevance to the 
particular vehicle and therefore contribute to the 
understanding of the situation. 

The detection can preferably be carried out to the 
front or to the rear and employed, for example, to warn 
of jams, for distance control from the vehicle in front 
or for monitoring the rear space. An important 
consideration here is that the relevance to the 
situation, or the potential hazard, of the detected 
objects is determined from their distance to the 
particular vehicle and the determined relative speed. 

Instead of evaluating pairs of stereo images, which are 
recorded by a stereo arrangement comprising optical 
sensors or cameras, in principle, even individually 
recorded images of different origin can be evaluated in 
order to determine the distance. 

Image regions of elevated objects and/or flat objects 
are determined. Elevated objects and/or flat objects 
are detected by combining 3D points in accordance with 
predetermined criteria. Combining is also designated 
clustering. In this case, the elevated objects are 
determined through features with similar distance 
values and flat objects are determined through features 
with similar height values. By means of this procedure, 
objects can be recognized and assessed not only 
reliably with regard to their distance but also with 
regard to specific features. Distinguishing between 
elevated and flat objects is therefore easily possible. 

Features of similar distance values and/or similar 
height are combined in order to form clusters. This 
distinction between elevated and flat objects is very 
important for reliable object recognition, for example 
the recognition of other motor vehicles, and the 
distinction from road markings. Since appropriately 
high computing powers can be implemented nowadays in 
modern motor vehicles, image segmentation of this type 
by means of distance determination and clustering can 
be carried out reliably and quickly. 

The relevant objects are followed over time and their 
distance and lateral position relative to the 
particular vehicle are determined, in order to assess 
the dynamic behaviour of the relevant objects. Only 
with knowledge of the dynamic behaviour of the 
determined objects are practical reactions of the 
driver or of the vehicle possible. An "anticipatory" 
mode of driving is therefore promoted. 

Furthermore, by means of this tracking, as it is known, 
phantom objects which occur sporadically can be 
suppressed, and the entire recognition performance can 
be increased. In this way, the number of extracted 
image regions to be classified in the image can be 
reduced, if these are checked for their local 
consistency by means of simple time tracking. By means 
of tracking the detected objects over time, the object 
characteristics, such as the distance, relative speed 
and relative acceleration, can be freed of measurement 
noise, for example by using a Kalman filter. 

For the purpose of object recognition, object 
hypotheses are determined, which are verified by 
comparison with object models. 

In this way, for the purpose of object recognition, the 
segmented image regions may be scanned in accordance 
with predetermined, statistically verified 2D features 
of the objects to be recognized, and the detected 
objects may be compared by using a neural network for 
the classification of a specific object type. In this 
way, reliable object recognition is carried out. 

The detected elevated objects may be, in particular, 
road vehicles, signposts, bridge columns, lamp posts 
and so on, whereas the detected flat objects may be, in 
particular, road markings and boundaries such as curb 
stones, crash barriers and so on. In this way, for 
example, the position of a road vehicle on a specific 
road lane can be determined in a simple way. 

In addition, it is advantageous to know the relative 
position and the relative speed of the detected objects 
relative to one another and to the moving vehicle, in 
order to determine the relevance of the detected 
objects to the situation. To this end, the distance 
measurement is evaluated, and an accurate road-lane 
object association is determined. 

During the image segmentation, one of the recorded 
pairs of stereo images can be scanned for significant 
features of objects to be registered. The spacing of 
the significant features may then be determined by 
means of cross correlation by comparing the respective 
features in a stereo image from the pair of stereo 
images with the same, corresponding features in the 
other stereo image from the pair of stereo images, 
recorded at the same time, the disparities which occur 
being evaluated. 

By determining the spacing of significant features in 
the pixel range, 3D points in the real world are 
determined relative to the coordinate system of the 
measuring device. The information obtained in this way 
from 3D points is therefore determined from different 
objects, such as vehicles, road markings, crash 
barriers, and so on. 

In addition to the above-described stereo-based 
approach, in principle object registration methods 
based on radar and/or infrared signals in the remote 
range are also possible. 

Further advantages, features and details of the 
invention become clearer by using the following 
description in conjunction with the appended drawings, 
in which: 

Fig. 1 shows a schematic representation of the method 
steps according to the invention; 

Fig. 2 shows a schematic representation to clarify the 
principle of the distance determination in the 
case of cameras with the same focal length 
arranged in parallel; 

Fig. 3 shows a schematic representation to clarify the 
principle of the correspondence search by means 
of cross correlation; 

Fig. 4 shows a schematic representation to clarify the 
principle of the 2D feature extraction in the 
case of evaluation by a neural network 
according to the invention; 

Fig. 5 shows a schematic representation to clarify the 
principle of coordinate normalization; and 

Fig. 6 shows a representation of a distance profile of 
an approaching vehicle. 

In the following text, the image segmentation 1 by 
means of stereo image processing is described, during 
which elevated objects 2 are detected. This is carried 
out through clustering 3 individual features with 
similar distances. Then, a vehicle recognition method 
5, 6 will be presented, with which road vehicles in the 
segmented image regions are recognized. For this 
purpose, features typical of vehicles are extracted 6 
and then compared with the internal vehicle model 
depiction 5 from a neural network 8. The basic 
procedure is shown schematically in Figure 1. 

Mono image processing is in principle also possible, 
given the use of similar means and a similar procedure. 

The characteristic that road vehicles are elevated by 
comparison with the road is used for the method of 
image segmentation presented here. To this end, use is 
made of a stereo camera system, with which it is 
possible to determine the distances of significant 
features which occur in the camera image on road 
vehicles. By means of this information, a statement 
about elevated objects 4 is possible. The continually 
increasing computing power which is available in the 
vehicle nowadays permits real-time analysis of pairs of 
stereo images. 

It is also possible to determine reliably on which lane 
a registered road vehicle is located. It is then 
possible to make a statement about the relevance of 
this registered road vehicle to the situation, on the 
basis of its position relative to the particular 
vehicle. The driver and/or the particular vehicle can 
then react accordingly. 

Although radar systems suitable for vehicles do not 
offer adequate lateral resolution for lane association, 
infrared systems have resolution and range problems and 
ultrasound can generally be used for the near range, it 
is in principle conceivable to employ these systems 
instead of or in combination with stereo camera 
systems. 

The principle of distance determination in the case of 
the parallel camera arrangement used is represented in 
Figure 2 on the basis of the pinhole camera model. The 
point P in the world is projected onto the sensor 
surfaces of each camera via the projection centres. $u_0$ 
and $u_1$ represent the deviation from the projection 
centre. Their difference 

$$\Delta u = u_0 - u_1$$

is designated the disparity $\Delta u$. By means of 
trigonometry and the sizes of the camera arrangement 
(focal length f and base width b), the distance d can 
be calculated: 

$$d = \frac{f \cdot b}{\Delta u}$$

Here, b represents the base width, f the focal length 
and d the distance to the point P. $u_0$ and $u_1$ are the 
distances of the projections of the point P onto the 
sensor surface. 

In the first processing step in the image segmentation, 
a search for significant features is carried out in one 
of the pairs of stereo images. A corresponding display 
(not shown) on a monitor or another display device may 
be provided only for research purposes. Significant 
features are supplied, for example, by edges, which 
occur reliably in the case of road vehicles. The 
locations of the selected edges, which define the image 
region to be correlated in the second processing step 
may be marked, for example, by means of rectangular 
frames in the monitor display. 

In order to determine the spacing of the features
displayed on the monitor, the respective disparities
are determined by comparison with the second stereo
image recorded at the same time. To this end, for each
rectangular image region, a search is made in the
corresponding image by means of cross correlation.
Figure 3 shows a schematic representation to clarify the
principle of the correspondence search by means of
cross correlation 11.

On account of the parallel alignment of the cameras,
the search region in the vertical direction may be
restricted to the epipolar lines, which in the case
shown in Figure 3 are the respective image lines. In the
horizontal direction, the corresponding search region is
defined in the corresponding image 9, 10 in accordance
with permissible disparities.

By using the KKFMF (the local, average-free,
normalized cross correlation function) as the
correlation function, brightness differences in the
pairs of images 9, 10, which occur for example as a
result of different solar radiation or different
control of the cameras, have only a slight effect on
the correlation value.

The correlation coefficient from the KKFMF is
calculated as follows:

$$\mathrm{KKFMF}(x, y) = \frac{\sum_{j} \sum_{i} \overline{F}(i,j) \cdot \overline{P_r}(x+i, y+j)}{\sqrt{\sum_{j} \sum_{i} \overline{F}(i,j)^{2} \cdot \sum_{j} \sum_{i} \overline{P_r}(x+i, y+j)^{2}}}$$

The values $\overline{F}(i,j)$ and $\overline{P_r}(x+i, y+j)$
represent the average-free grey values from the
rectangular image regions $F(i,j)$ and $P_r(x+i, y+j)$.
Because of the normalization, the results from the KKFMF
lie within the interval [-1, 1]. The value 1 represents
agreement in pairs, -1 represents correspondingly inverse
agreement.
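The correspondence search described above can be sketched in Python as follows. This is only an illustration under stated assumptions: images are plain nested lists of grey values, the search runs along the same image lines (the epipolar lines of a parallel arrangement), and the candidate region in the corresponding image is shifted to the left by 0 to `max_disparity` pixels. The function names, the search direction and the handling of uniform regions are hypothetical choices, not taken from the method itself.

```python
from math import sqrt


def kkfmf(patch_a, patch_b):
    """Average-free, normalized cross correlation of two equally sized patches."""
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    a = [v - mean_a for v in a]          # remove the average grey value
    b = [v - mean_b for v in b]
    denom = sqrt(sum(v * v for v in a) * sum(v * v for v in b))
    if denom == 0.0:
        return 0.0                       # uniform patch: treated as no match (assumption)
    return sum(x * y for x, y in zip(a, b)) / denom


def crop(image, row, col, height, width):
    """Cut a rectangular region out of a nested-list image."""
    return [line[col:col + width] for line in image[row:row + height]]


def best_disparity(reference, other, row, col, height, width, max_disparity):
    """Search along the same image lines of `other` for the region of `reference`."""
    template = crop(reference, row, col, height, width)
    best_shift, best_score = 0, -1.0
    for shift in range(max_disparity + 1):   # permissible disparities only
        c = col - shift
        if c < 0:
            break
        score = kkfmf(template, crop(other, row, c, height, width))
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift, best_score
```

Because the average is removed and the result is normalized, a region whose grey values are scaled and offset (for example `2 * g + 10`) still correlates with a value close to 1 against its counterpart, which is the insensitivity to brightness differences mentioned above.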

In the last processing step in the image segmentation,
combining (cluster formation) of features with similar
distance values takes place (cf. Figure 1). The
relative height of the clusters formed is compared with
a fixed minimum height, in order to ensure an elevated
object 2. In this case, elevated objects are determined
through features with similar distance values, and flat
objects are determined through features with similar
height values.
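A minimal sketch of such a cluster formation is given below. The criteria actually used are not specified here, so the sketch makes several assumptions that are purely illustrative: each feature is a 3D point `(x, y, z)` with `y` as height and `z` as distance, features are grouped when consecutive distance values differ by no more than `distance_tolerance`, lateral position is ignored, and the relative height of a cluster is taken as the difference between its highest and lowest feature.

```python
def cluster_elevated_objects(points, distance_tolerance, min_height):
    """Group 3D features with similar distance and keep sufficiently tall clusters.

    points: iterable of (x, y, z) tuples, y = height, z = distance.
    """
    clusters = []
    for point in sorted(points, key=lambda p: p[2]):
        # Extend the current cluster while the distance gap stays small.
        if clusters and point[2] - clusters[-1][-1][2] <= distance_tolerance:
            clusters[-1].append(point)
        else:
            clusters.append([point])

    elevated = []
    for cluster in clusters:
        heights = [p[1] for p in cluster]
        # Compare the height extent of the cluster with the fixed minimum height.
        if max(heights) - min(heights) >= min_height:
            elevated.append(cluster)
    return elevated
```

Flat objects could be grouped analogously by sorting and comparing the height values instead of the distance values.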

For research purposes, the resulting clusters can be
inserted as frames into a real monitor display (not
shown) of the observed scene. In addition, the
distances belonging to the segmented image regions may
be specified in numerical values on the frames.

In addition to vehicles, other elevated objects, such
as sign posts and road margins, are also segmented. In
order to discard erroneous object hypotheses, the
stereo-based object segmentation process within the
detected image regions is followed by 2D object
recognition.

In the following text, the 2D feature extraction and
the vehicle recognition are described. These
processing steps are likewise shown in Figure 1.

Road vehicles have significant features in the image
plane, for example edges and corners, as well as
symmetry. These features have been determined
empirically for the purpose of a search, and the
recognition of road vehicles is carried out by means of
direct comparison with a vehicle model. In the method
shown here, a search is made in accordance with
statistically verified 2D features 7, which are
subsequently compared with the internal model depiction
of vehicles from a neural network 8. Figure 4 shows a
schematic representation to clarify the principle of
the 2D feature extraction during evaluation by a neural
network.

In order to determine significant and statistically
verified 2D features 7 of road vehicles, a data set of
50 images, which show cars in various scenes, was used
as a basis. By using the method explained below, a
plurality of typical 9x9 patterns, which often occur in
the scenes used, was determined (referred to below as
comparative patterns).

The comparative patterns typically occur at specific
locations on the vehicle. For example, the features may
occur in the lower region of the vehicles. At these
locations, most road vehicles exhibit similar
structural areas. These are, for example, the shadows
under the car and the corners of the tyres, as well as
the course of the structural areas at the head lamps.

In the segmented image regions, a search window is
defined in order to calculate the features determined
by means of the predefined comparative patterns.
Depending on the distance of the hypothetical object, a
search window of matched size is defined and correlated
with the comparative patterns. The locations in the
search window which exhibit a local maximum of the
correlation function identify significant features, as
Figure 5 shows.
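The search for local maxima of the correlation function within a search window can be sketched as follows. The sketch assumes that the same average-free, normalized correlation is used as in the stereo correspondence search, that the search window and the comparative pattern are nested lists of grey values, and that a location counts as a significant feature only if its correlation value reaches a hypothetical `threshold` and exceeds all of its direct neighbours. None of these details is specified above; they are illustrative choices.

```python
from math import sqrt


def zncc(a, b):
    """Average-free, normalized cross correlation of two equally long value lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    # A uniform region has no structure to correlate; treat it as no match.
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0


def correlation_map(window, pattern):
    """Correlate `pattern` with every position of `window` (both 2D lists)."""
    ph, pw = len(pattern), len(pattern[0])
    flat_pattern = [v for row in pattern for v in row]
    return [
        [zncc([v for row in window[y:y + ph] for v in row[x:x + pw]], flat_pattern)
         for x in range(len(window[0]) - pw + 1)]
        for y in range(len(window) - ph + 1)
    ]


def local_maxima(score_map, threshold):
    """Return (row, col, score) of strict local maxima at or above `threshold`."""
    peaks = []
    for y, row in enumerate(score_map):
        for x, value in enumerate(row):
            if value < threshold:
                continue
            neighbours = [
                score_map[ny][nx]
                for ny in range(max(0, y - 1), min(len(score_map), y + 2))
                for nx in range(max(0, x - 1), min(len(row), x + 2))
                if (ny, nx) != (y, x)
            ]
            if all(value > n for n in neighbours):
                peaks.append((y, x, value))
    return peaks
```

Calling `local_maxima(correlation_map(window, pattern), threshold)` once per comparative pattern yields the coordinates of the extrema together with the pattern that produced them. The pattern size is arbitrary in this sketch; a 9x9 pattern is handled in the same way as a smaller one.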

The coordinates of the extrema and the associated
comparative patterns provide the input features for the
feed-forward network used. This network has been trained
for the occurrence of typical combinations of features
which identify vehicles.



The real-time method according to the invention for the
stereo-based tracking of objects at a considerable
distance has been tested in real road scenes. Figure 6
shows the measured distance data from an approaching
vehicle. As can be seen in Figure 6, a measurement
inaccuracy of about ± 50 cm occurs at a distance of
100 metres.
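As a rough plausibility consideration, and assuming that the distance follows the usual pinhole relation $d = f \cdot b / \Delta u$ for a parallel camera arrangement, a first-order error propagation indicates how an error in the disparity maps to an error in the distance. The symbols $\delta d$ and $\delta(\Delta u)$ are introduced here only for illustration.

```latex
\left| \frac{\partial d}{\partial \Delta u} \right|
  = \frac{f \cdot b}{\Delta u^{2}}
  = \frac{d^{2}}{f \cdot b}
\quad\Longrightarrow\quad
\delta d \approx \frac{d^{2}}{f \cdot b}\, \delta(\Delta u)
```

Under this assumption, the distance error grows with the square of the distance for a fixed disparity error. The camera parameters behind the measured inaccuracy are not stated here, so no numerical check is attempted.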



In order to keep the determined distance data free of
noise and largely free of measurement errors on account
of erroneously determined correspondences, the use of a
Kalman filter is suggested, which supplies more
meaningful results by considering the measured values
over time. Extending the 2D feature extraction by
texture measures and symmetry operations offers further
potential for improving the method presented.
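How a Kalman filter might smooth the measured distance values over time can be sketched as follows. The filter design is not specified above, so this sketch assumes a simple constant-velocity model with the state (distance, relative speed), a white-noise acceleration as process noise, and direct measurement of the distance. The class name and all noise parameters are hypothetical.

```python
class DistanceKalmanFilter:
    """Constant-velocity Kalman filter over the state (distance, relative speed)."""

    def __init__(self, distance, speed=0.0, variance=100.0,
                 accel_noise=1.0, measurement_noise=0.25):
        self.distance = distance
        self.speed = speed
        self.p = [[variance, 0.0], [0.0, variance]]  # state covariance P
        self.q = accel_noise          # variance of the unmodelled acceleration
        self.r = measurement_noise    # variance of one distance measurement

    def predict(self, dt):
        """Propagate state and covariance by dt seconds: x = F x, P = F P F^T + Q."""
        self.distance += self.speed * dt
        (p00, p01), (p10, p11) = self.p
        self.p = [
            [p00 + dt * (p01 + p10) + dt * dt * p11 + self.q * dt ** 4 / 4,
             p01 + dt * p11 + self.q * dt ** 3 / 2],
            [p10 + dt * p11 + self.q * dt ** 3 / 2,
             p11 + self.q * dt ** 2],
        ]

    def update(self, measured_distance):
        """Blend in one distance measurement and return the filtered distance."""
        (p00, p01), (p10, p11) = self.p
        s = p00 + self.r                    # innovation variance
        k0, k1 = p00 / s, p10 / s           # Kalman gain for distance and speed
        residual = measured_distance - self.distance
        self.distance += k0 * residual
        self.speed += k1 * residual
        self.p = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]
        return self.distance
```

In use, `predict(dt)` would be called once per stereo image pair with the time elapsed since the previous pair, followed by `update(d)` with the newly measured distance. A single outlier caused by an erroneous correspondence then shifts the estimate only by a fraction of its deviation.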



In summary, by using the method according to the
invention, reliable distance determination and
recognition of objects, in particular of road vehicles
in front of and/or behind a travelling vehicle, is
possible up to a considerable distance.

New Patent Claims

1. Method of detecting objects in the vicinity of a
road vehicle up to a considerable distance, in
which the distance from a moving or stationary
vehicle to one or more objects is calculated by
distance-based image segmentation (1) by means of
stereo image processing, and characteristics of
the detected objects are determined by object
recognition in the segmented image regions,

image regions of elevated objects and/or flat
objects being determined (2), and

elevated objects and/or flat objects being
detected by combining (clustering) (3) 3D points
in accordance with predetermined criteria,

elevated objects being determined through features
with similar distance values and flat objects
being determined through features with similar
height values, and

the relevant objects (4) being followed over time
(tracking) and their distance and lateral position
relative to the particular vehicle being
determined, in order to assess the dynamic
behaviour of the relevant objects,

characterized in that

for the purpose of object recognition, object
hypotheses are determined, which are verified by
comparison with object models (5),

the segmented image regions being scanned in
accordance with predetermined, statistically
verified 2D features of the objects to be
recognized (6, 7), and

the detected objects being compared by using a
neural network (8) for the classification of a
specific object type.

2. Method according to Claim 1, characterized in that
the detected elevated objects (4) are in
particular road vehicles and/or the detected flat
objects are in particular road markings and
boundaries.

3. Method according to Claim 1 or 2, characterized in
that the relative position and the relative speed
of the detected objects relative to one another
and to the moving vehicle are determined by
evaluating the distance measurement, in order to
determine an accurate road-lane object association
and/or the relevance of the detected objects to
the situation.

4. Method according to one of Claims 1 to 3,
characterized in that

one of the recorded pairs of stereo images
(9, 10) is scanned for significant features of
objects to be registered, and

the spacing of the significant features is
determined by comparing the respective features in
a stereo image from the pair of stereo images with
the same, corresponding features in the other
stereo image from the pair of stereo images
(9, 10), recorded at the same time, the
disparities which occur being evaluated by means
of cross correlation (11).

5. Method according to one of Claims 1 to 4,
characterized in that by determining the spacing
of significant features in the pixel range, 3D
points in the real world are determined relative
to the coordinate system of the measuring device.



6. Method according to one of Claims 1 to 5,
characterized in that the objects are detected by
means of radar and/or infrared sensors and/or a
stereo or mono arrangement of optical sensors or
cameras.

(57) Abstract: The invention relates to a
method of detecting objects within a wide
range of a road vehicle. According to said
method, the distance between a moving or
stationary vehicle and one or more objects
is calculated by distance-based image
segmentation using stereoscopic image
processing techniques and the properties
of the detected objects are determined by
object recognition in the segmented image
areas. Image areas of three-dimensional
and/or flat objects are detected and said
three-dimensional and/or flat objects are
detected by clustering 3D pixels according
to defined criteria. Three-dimensional
objects are determined by features with
similar distance values and flat objects by
features with similar height values.

(57) Abstract (translated from the German): The
invention relates to a method for detecting
objects in the surroundings of a road vehicle
up to a great distance, in which the distance
from a moving or stationary vehicle to one or
more objects is calculated by distance-based
image segmentation by means of stereo image
processing, and characteristics of the detected
objects are determined by object recognition in
the segmented image regions
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